ISyE Tackling the Challenges of Big Data

Apr 25, 2013

Every day, billions of bytes of data are generated from product realization, purchasing transactions, information collected from health centers, and more. Engineers in the Stewart School of Industrial & Systems Engineering (ISyE) are tackling the challenges of big data sets to solve logistics problems in high volume distribution centers, improve manufacturing processes, build predictive models, and make advancements in healthcare, to name a few. Several of these ISyE projects were highlighted in the Fall 2012 – Winter 2013 issue of Research Horizons, “Tackling the Challenges of Big Data.”

Below are highlights of some of the ISyE projects that appear in this issue:

Unraveling Logistics Problems

High volume distribution centers – whether they serve WalMart, Home Depot, or the Department of Defense – ship hundreds of thousands of items to many destinations daily. If these facilities can systematically save a few seconds of labor here or a centimeter of space there, the total efficiency gain can be significant.

Yet achieving big savings requires finding patterns in huge data sets. Engineers must analyze thousands or millions of customer orders and then use that information to optimize warehouse layouts and processes.

John Bartholdi is the Manhattan Associates Chair of Supply Chain Management in the H. Milton Stewart School of Industrial and Systems Engineering. He’s working on warehousing optimization for the Defense Logistics Agency, and has also performed similar research for numerous corporations.

“We build tools to automate the search for exploitable patterns, which can hide in vast data sets,” said Bartholdi, who is also research director of the Supply Chain and Logistics Institute. “We analyze huge histories of customer orders, just like Amazon does. But instead of doing it to tune advertising and drive sales, we do it to tune the warehouse and the entire supply chain, to drive efficiencies.”
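As a rough illustration of what searching order histories for exploitable patterns can look like, the sketch below counts how often pairs of items appear together in the same order. It is not the tooling Bartholdi describes: the function name, the sample orders, and the idea of using co-occurrence counts to suggest which items to store near each other are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def count_item_pairs(orders):
    """Count how many orders contain each unordered pair of items."""
    pair_counts = Counter()
    for order in orders:
        # Deduplicate and sort so each pair is counted once per order
        # and always appears in the same (alphabetical) orientation.
        items = sorted(set(order))
        pair_counts.update(combinations(items, 2))
    return pair_counts


# Hypothetical order history; real histories hold thousands or millions of orders.
orders = [
    ["tape", "boxes", "labels"],
    ["tape", "boxes"],
    ["gloves", "labels"],
    ["tape", "boxes", "gloves"],
]

pair_counts = count_item_pairs(orders)
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```

Pairs that are frequently ordered together are candidates for adjacent storage locations, which is one simple way a pattern in the data could translate into saved seconds of picking labor.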

Understanding the Manufacturing Process

Jianjun (Jan) Shi, the Carolyn J. Stewart Chair and professor in the School of Industrial and Systems Engineering, employs a multi-disciplinary data fusion approach to improving manufacturing processes that involve massive information sets. Shi combines data, statistical methods, signal processing, control theory, and domain knowledge to solve manufacturing problems.

“We frequently analyze data from a factory’s information system to monitor system status and performance via system informatics and control techniques,” Shi said. “We then develop automated algorithms that can be implemented directly into production systems for performance improvement.”

Among his projects:

Working with major automobile manufacturers, Shi has introduced “Stream of Variation” technology that monitors multistage assembly stations to reduce variations in manufacturing processes. The resulting information is used to pinpoint the cause of any variation problems in the final product.
In research for a large number of U.S. and international steel companies, Shi and his team have developed data fusion algorithms for inline sensing and defect detection for product quality and production efficiency improvements. That software has been implemented in a dozen real-world production systems.

Building Predictive Models

Xiaoming Huo focuses on statistical modeling of large, diverse data sets. Huo, a School of Industrial and Systems Engineering professor, uses existing data to build predictive models – tools that forecast probable outcomes.

“That’s a distinct challenge,” Huo said, “because each data set is large and complex and its useful features are unknown.” He works in areas that include geophysics, automatic control, engineering signal modeling, financial data analysis, and smart environment design.

Often, the data appear in the form of images, and Huo must develop feature-extraction methods customized for each problem.

“Given the size of the data and limitations on the number of features that can be utilized, the task of searching for useful data points is truly like searching for needles in a haystack,” said Huo, who teaches both computational statistics and financial data. “Defining the predictors – the variables that you are going to utilize to build the statistical model – is the hardest question.”

Among the approaches he uses are signal and image processing methods, along with inputs of “domain knowledge” – expert knowledge of the domain in question.

In one recent geophysical project, Huo’s goal was to separate desired features from many similar ones. His data source was a large one – a 3-D image produced by some 8,000 sensors detecting manmade sonic vibrations in the earth over a 10-kilometer area.

Huo used automated image processing techniques, including Fourier domain techniques that analyze signals with respect to frequency rather than time. He extracted desired high frequency data, resulting in a ground structure image that offered important information to petroleum geologists.
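To make the idea of working in the frequency domain concrete, here is a minimal sketch of a high-pass filter built on the Fourier transform. It is a simplified one-dimensional illustration, not Huo's actual method: the synthetic signal, the sampling rate, the cutoff frequency, and the function name `high_pass` are all assumptions chosen for the example, and NumPy is assumed to be available.

```python
import numpy as np


def high_pass(signal, sample_rate_hz, cutoff_hz):
    """Remove frequency components below cutoff_hz and return the filtered signal."""
    spectrum = np.fft.rfft(signal)                       # time domain -> frequency domain
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    spectrum[freqs < cutoff_hz] = 0.0                    # discard the low frequencies
    return np.fft.irfft(spectrum, n=len(signal))         # back to the time domain


# Synthetic trace (hypothetical): a strong slow 5 Hz component
# plus a weaker 60 Hz component that we want to keep.
sample_rate_hz = 1000
t = np.arange(1000) / sample_rate_hz
low = np.sin(2 * np.pi * 5 * t)
high = 0.3 * np.sin(2 * np.pi * 60 * t)

filtered = high_pass(low + high, sample_rate_hz, cutoff_hz=30.0)
print(np.allclose(filtered, high, atol=1e-8))
```

Because both test frequencies fall exactly on the transform's frequency bins here, the filtered result matches the 60 Hz component almost exactly. Real sensor data is rarely this clean, and a 3-D data set would call for multidimensional transforms and more careful filter design.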

Predicting Drug Response

Ming Yuan, an associate professor in the School of Industrial and Systems Engineering, is using computational and mathematical approaches to analyze how gene expression evolves over time in individuals with breast cancer – and whether these patterns can predict treatment outcomes.

Yuan is studying how gene expression evolves during the menstrual cycle and whether there is any association between these patterns and cancer relapse. Gene expression determines how much biochemical material results from a gene, and can be used to judge how active a gene is.

“Our goal is to weed out the genes that just change expression level due to a woman’s menstrual cycle and not because of tumor progression or treatment,” explained Yuan, who is also a Georgia Cancer Coalition Distinguished Cancer Scholar. “We want to know which genes are abnormally expressed over time and behave differently from the majority of genes, because that would make them likely drug targets.”

Improved predictors of relapse risk could help cancer patients make better treatment decisions in consultation with their physicians, he added. Yuan’s research is supported by the National Science Foundation and the Georgia Cancer Coalition.

Advancing Health-related Readiness

Eva K. Lee, a professor in the School of Industrial and Systems Engineering, specializes in large-scale computational algorithms and systems modeling, with an emphasis on medical-healthcare risk and decision analysis, and logistics management. She is bringing complex modeling, machine learning, and optimization techniques to bear on a number of health informatics projects that involve very large data sets.

“Problems in health systems and biomedicine can often be addressed through systems modeling, algorithm and software design, and decision theory analysis,” said Lee, who is director of the Center for Operations Research in Medicine and HealthCare. “By advancing these tools, we can model very large-scale evolving and heterogeneous data sets to pursue and uncover effective solutions.”

Among her projects:

Software suite for disaster medicine and emergency response – Lee is collaborating with the Centers for Disease Control and Prevention and state and local authorities on a project that uses large-scale informatics techniques to support preparedness for epidemics and other emergencies. The work addresses biological, radiological and chemical emergency incidents as well as natural disasters. It brought Lee to Fukushima, Japan, to study the response to the radiological disaster there.
Strategies for predicting the immunity of vaccines – This project uses novel predictive analytics to mine clinical and biological data, with the aim of predicting the effectiveness of vaccines on different groups of individuals. Under the leadership of Professor Bali Pulendran at Emory University, and in collaboration with researchers at the Dana-Farber Cancer Institute, Duke University, the Institute for Systems Biology, and the National Institutes of Health, Lee is developing a machine learning framework to predict vaccine outcomes based on massive genomic-temporal data.
Personalized target delivery for optimal treatment of cervical cancer – Working with Rush University Medical Center, Lee and her team have been the first to incorporate sophisticated escalated dose delivery technology for cervical cancer patients. These personalized treatment plans incorporate biological information obtained from positron emission tomography that relates cancer cell proliferation with spatial distribution. The treatment objective is to optimize tumor control.

The full issue of Research Horizons is available online. For more information on big data research at Georgia Tech, please visit www.gatech.edu/research/areas/big-data.

For More Information Contact

Barbara Christopher
Industrial and Systems Engineering
404.385.3102
