Technological Case Studies
Big Data Case Study – Hadoop Implementation
Organizing and analyzing massive stores of unstructured data can be a daunting challenge. Questions quickly arise about how to manage this data: How much will a solution cost? Where do we store it? How do we analyze it efficiently? Will our relational databases be able to sort and query it effectively?
The Big Data Challenge
Our client, a leader in the Transportation and Logistics domain, was facing this big data predicament. Combined, their trucks travel roughly 8 million miles per day to deliver their cargo. The Client needed a way to analyze truck travel patterns to understand a range of issues, including how many “empty miles” were accrued on routes, and then adjust those routes for more efficient deliveries. Using their in-house logistics tracking software, the Client had been temporarily storing log files to analyze and debug issues related to the optimizer’s “selection process”. Because of the massive amount of data being pushed into these files, they could retain the data for only a short time. Additionally, since the data was unstructured, developers had to manually extract, parse, and search it every time they needed to perform an analysis.
Big Data Business Case
A solution was needed to add structure to these data logs, provide the ability to run ad-hoc queries when issues occurred, and perform analytics against the data to improve trucking route efficiency. A traditional relational database system would be too resource-intensive given the volume and velocity of the data. The Client needed a big data solution that provided:
- Processing and storage of high volume/velocity data
- Ability to ad-hoc query and run analytics against data
- Sustainable data indexing and organization
- Data visualization for business users
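To make the idea of adding structure to the logs concrete, here is a minimal Python sketch. The log line layout, the field names (`truck`, `route`, `miles`, `loaded`), and the sample values are all hypothetical, since the Client’s actual log format is not described here. It only illustrates the kind of parsing and ad-hoc aggregation involved, not the solution that was implemented.

```python
import re
from collections import defaultdict

# Hypothetical log line layout; the Client's real format is not given in this case study.
LINE_RE = re.compile(
    r"(?P<ts>\S+) truck=(?P<truck>\S+) route=(?P<route>\S+) "
    r"miles=(?P<miles>[0-9.]+) loaded=(?P<loaded>[01])"
)

def parse_line(line):
    """Turn one raw log line into a structured record, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "ts": m.group("ts"),
        "truck": m.group("truck"),
        "route": m.group("route"),
        "miles": float(m.group("miles")),
        "loaded": m.group("loaded") == "1",
    }

def empty_miles_by_route(lines):
    """Sum the miles driven without cargo, grouped by route."""
    totals = defaultdict(float)
    for line in lines:
        rec = parse_line(line)
        if rec is not None and not rec["loaded"]:
            totals[rec["route"]] += rec["miles"]
    return dict(totals)

# Made-up sample lines, for illustration only.
SAMPLE = [
    "2014-03-01T08:15:00 truck=T102 route=R7 miles=120.5 loaded=0",
    "2014-03-01T09:40:00 truck=T102 route=R7 miles=200.0 loaded=1",
    "2014-03-01T10:05:00 truck=T311 route=R7 miles=30.0 loaded=0",
    "2014-03-01T10:30:00 truck=T311 route=R9 miles=75.0 loaded=0",
    "unparseable debug noise",
]

if __name__ == "__main__":
    print(empty_miles_by_route(SAMPLE))
```

Once each line is a structured record, a question such as “which routes accrue the most empty miles?” becomes a simple aggregation instead of a manual search through raw files.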
Aptude Consulting Big Data Solution – The Hadoop Implementation
After gathering information through our discovery and requirements process, we architected a big data solution that uses Hadoop in conjunction with other key open-source components to harness its full potential. In doing so, we created the MapReduce architecture illustrated below.
Our solution pre-processes and prepares the data for consumption, creating a “solution” file and a “problem” file. These files are then aggregated and distributed: log files are sent to Solr for indexing and “solution” data to HDFS. Data is passed into a sink that processes it and loads it into a Hadoop component, from which it is distributed to SolrCloud and HDFS, respectively. The end result is structured data available in multiple formats, with low-latency queries provided through Cloudera Impala and data visualization through OBIEE connectivity.
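The split-and-route step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PROBLEM` and `SOLUTION` tags, the `|` separator, and the function names are hypothetical, and plain lists stand in for Solr and HDFS rather than calling either system.

```python
# Hypothetical record markers; the real optimizer log layout is not given in this case study.
PROBLEM_TAG = "PROBLEM"
SOLUTION_TAG = "SOLUTION"

def preprocess(raw_lines):
    """Split raw optimizer log lines into 'problem' and 'solution' payloads."""
    problems, solutions = [], []
    for line in raw_lines:
        tag, _, payload = line.partition("|")
        if tag == PROBLEM_TAG:
            problems.append(payload)
        elif tag == SOLUTION_TAG:
            solutions.append(payload)
    return problems, solutions

def route(raw_lines, index_sink, storage_sink):
    """Send every log line to the index sink and solution payloads to the storage sink."""
    _, solutions = preprocess(raw_lines)
    index_sink.extend(raw_lines)    # stand-in for indexing the logs in Solr
    storage_sink.extend(solutions)  # stand-in for writing "solution" data to HDFS

if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    raw = ["PROBLEM|load=42", "SOLUTION|truck=T7", "INFO|heartbeat"]
    index, storage = [], []
    route(raw, index, storage)
    print(index)
    print(storage)
```

Keeping the sinks as parameters means the same routing logic could be pointed at real clients later without changing the pre-processing code.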
With minimal hardware resources and a collection of open-source software requiring no licensing fees, we realized the Client’s big data solution at a fraction of the cost a traditional relational database solution would have required. The Hadoop implementation resulted in cost and time savings, with an additional benefit from the boost in productivity they will achieve with their new analytical assets.
Could your organization benefit from an open-source big data solution with Hadoop? If you handle large data sets that require analytical insights, then look no further. Aptude brings 14 years of IT consulting to the table, with expertise in big data implementations and in both Microsoft and Oracle business intelligence solutions.
What our clients are saying…
Aptude provides onsite and offshore Oracle DBA support, which includes troubleshooting, back-up, recovery, migration, upgrades, and daily maintenance of Oracle database servers. Aptude has been working with our team for the past four years and we continue to use them and are satisfied with their work.
Aptude provided Build.com a Java, MySQL, Webservices and other UI based solution in the business domain of analyzing and reporting on user activities for our ecommerce website, utilizing Omniture’s APIs to download, parse, regenerate, and upload data back so that we could be more effective in our marketing. I was satisfied with their project work and delivery and would consider utilizing them for future projects. Build.com
Aptude provided us with Oracle DBA migration support, including an upgrade from Oracle 11.1 to Oracle 11.2, and the project was completed on time and to specifications. The project manager and project consultants were responsive and proactive, resulting in a successful conclusion to the work. I would definitely contract with them again, and have recommended them to other technical offices at the University of Georgia.
Thank you for the hard work your team has put forth to staff the contract positions at Wolters Kluwer. Aptude has consistently scored high in our supplier carding and even more important you are a vendor we can always trust. I am especially impressed with your ability to tackle our positions that other vendors have not been able to fill.