DATA ENGINEERING EXCELLENCE: A CATALYST FOR ADVANCED DATA ANALYTICS IN MODERN ORGANIZATIONS

Krishnamurthy Oku; Rama krishna Vaddy; Abhinay Yada; Ravi Kumar Batchu

Published: Jan 27, 2024

Abstract

This study examines "Data Engineering Excellence" in modern organizations and its role as a catalyst for advanced data analytics initiatives. Using a mixed-methods approach that combines a literature review with real-world case studies, it highlights the strategic integration of robust data engineering practices. Key components explored include cutting-edge technologies, best practices, and data governance frameworks. The findings point to tangible benefits, namely enhanced data quality, reduced latency, and improved scalability, which in turn improve the efficacy of advanced analytics. The study also addresses economic implications, including cost savings and increased operational efficiency, and emphasizes ethical considerations in data handling and privacy. Overall, the research contributes to the discourse on data engineering and analytics by underscoring the strategic importance of Data Engineering Excellence to organizational success.

How to Cite

DATA ENGINEERING EXCELLENCE: A CATALYST FOR ADVANCED DATA ANALYTICS IN MODERN ORGANIZATIONS (K. Oku, R. krishna Vaddy, A. Yada, & R. K. Batchu, Trans.). (2024). International Journal of Creative Research In Computer Technology and Design, 6(6), 1-10. https://jrctd.in/index.php/IJRCTD/article/view/34

Issue

Vol. 6 No. 6 (2024): IJCRCTD

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
