About the client
The client is one of the world’s leading providers of market and consumer research, enabling its customers to make smarter business decisions.
The client was facing the following issues:
- The existing legacy platform, built on commercial off-the-shelf (COTS) analytics software, incurred high data processing costs and required manual intervention at multiple points.
- Integrating new data sources incurred high customization costs.
- The existing solution could not synchronize data across multiple source streams, and the data was difficult to maintain.
Cybage’s solution comprised the following:
- Business Intelligence (BI) solution on a Big Data platform, built on Hadoop technologies and deployed using Atlassian Bamboo, with nested, scheduled ETL workflows that support multiple data partitions across multiple periods.
- Pluggable solution that enabled easy integration with new data sources.
- Custom-made automation framework, based on the Oozie JMS notification system, that aided quality control.
- The solution transforms the legacy data set into a common big data file system, creates an inventory of all data sets, applies data science modules for imputation, fusion, and weighting, manages dependencies between processing steps, and automates job execution.
- Maintenance and use of versioned data to ensure data accuracy; for instance, data is stored and retrieved by version.
- Creation of cohesive components that could be rapidly plugged into the system for workflows to execute seamlessly.
- Provision of rich reports on aggregated data with additional product filters.
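The dependency management between processing steps described above can be illustrated with a small sketch. It is not the client's Oozie workflow: the step names are borrowed from the list above, while the dependency edges and the helper names `execution_order` and `run_pipeline` are assumptions made purely for illustration. The sketch orders the steps with Python's standard `graphlib` module so that each step runs only after its prerequisites.

```python
from graphlib import TopologicalSorter

# Step names come from the solution description; the dependency edges
# are assumed for illustration and are not the client's actual workflow.
DEPENDENCIES = {
    "transform": set(),            # legacy data -> common file system
    "inventory": {"transform"},    # catalogue the transformed data sets
    "imputation": {"inventory"},
    "fusion": {"imputation"},
    "weighting": {"fusion"},
    "report": {"weighting"},       # aggregated reports come last
}


def execution_order(dependencies):
    """Return step names in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def run_pipeline(dependencies, actions):
    """Run each step's action after all of its prerequisites have run."""
    completed = []
    for step in execution_order(dependencies):
        actions[step]()            # hypothetical callable for the step
        completed.append(step)
    return completed


if __name__ == "__main__":
    # Placeholder actions that only announce the step being executed.
    actions = {name: (lambda n=name: print("running", n)) for name in DEPENDENCIES}
    run_pipeline(DEPENDENCIES, actions)
```

In a production Hadoop setup this ordering would typically be declared in the workflow scheduler rather than in application code; the sketch only shows the underlying idea of resolving dependencies before automating execution.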
Cybage adhered to the following plan to provide effective results to the client:
- Followed Scrum and Kanban methodologies to accelerate adoption, help overcome common pitfalls and challenges, and evolve a framework that could sustain value delivery and agility over time.
- Used development tools such as SonarQube for static code analysis and code coverage to keep technical debt under control and manage code quality.
The solution provided the following benefits to the client:
- Using this platform, the client was able to offer comprehensive insights harmonized with local market intelligence, enabling its customers to make informed decisions.
- The platform offered low data processing cost with high scalability and greater flexibility to adopt new data sources.
Tools and technologies
Cybage used the following tools and technologies:
- Development: Java, MapReduce, Pig, Hive, Oozie, Kite Dataset Management, HCatalog, Impala, Graphite, EXASOL
- Testing: JUnit, PigUnit, hcatunit, kite-unit, QA automation tool
- CI: Atlassian Bamboo
- Tools: Git, Vagrant, SourceTree, Cloudera Manager
- ALM: Integrated Atlassian tools (Confluence, JIRA, Stash), Lync, SonarQube
Cybage services utilized
Development, Testing, Test Automation, Cloud and BigData, BI, ALM, DevOps Capabilities