Machine Learning in Google Cloud Big Query using SQL

International Journal of Computer Science and Engineering
© 2023 by SSRG - IJCSE Journal
Volume 10 Issue 5
Year of Publication: 2023
Authors: Ravi Kashyap

How to Cite?

Ravi Kashyap, "Machine Learning in Google Cloud Big Query using SQL," SSRG International Journal of Computer Science and Engineering, vol. 10, no. 5, pp. 17-25, 2023. Crossref, https://doi.org/10.14445/23488387/IJCSE-V10I5P103

Abstract:

In today's world, data has become a valuable resource for businesses, governments, researchers, and individuals alike. However, extracting value from data requires proper context: collecting and analyzing data without understanding its context can lead to inaccurate conclusions and misguided decisions. Successful organizations gather data that can be analyzed to gain greater insight into the business and to open new opportunities, allowing them to innovate products and services based on consumer preference. Data is the lifeblood of all businesses, and data-driven decisions can make a significant difference in staying ahead of the competition. Machine learning can be the key to unlocking the value of corporate and customer data: it can help businesses identify patterns and trends that might not be apparent to humans, leading to more accurate predictions and better decisions.
Challenge: Machine learning (ML) typically requires trained professionals and data scientists with good knowledge of programming languages such as Python or R. A major challenge for data analysts and business people with an SQL background is that, however strong their domain knowledge, they cannot contribute much technically to ML initiatives in their organization. Data scientists, on the other hand, have good knowledge of statistics, algorithms, and Python or R, but may lack specific data or domain knowledge, which can lead to inaccurate predictions and flawed decisions.
This paper aims to enable data analysts and data scientists to build and deploy machine learning models on massive datasets using SQL queries, without extensive knowledge of programming or data preprocessing.
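As an illustration of what building a model "using SQL queries" can look like, the following is a minimal sketch, not taken from the paper, that assembles BigQuery ML statements as plain strings. The `CREATE MODEL`, `ML.EVALUATE`, and `ML.PREDICT` constructs are part of BigQuery ML; the helper functions and every dataset, table, model, and column name (`demo.churn_model`, `demo.customers`, `churned`, and so on) are hypothetical placeholders chosen for the example.

```python
# Sketch only: builds BigQuery ML SQL text; it does not connect to BigQuery.
# All dataset, table, model, and column names are hypothetical.


def create_model_sql(model, source_table, label, features):
    """Return a CREATE MODEL statement that trains a logistic regression."""
    columns = ", ".join(list(features) + [label])
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'LOGISTIC_REG', input_label_cols = ['{label}']) AS\n"
        f"SELECT {columns}\n"
        f"FROM `{source_table}`"
    )


def evaluate_sql(model, eval_table):
    """Return a query that scores the trained model on labeled rows."""
    return (
        f"SELECT *\n"
        f"FROM ML.EVALUATE(MODEL `{model}`, (SELECT * FROM `{eval_table}`))"
    )


def predict_sql(model, new_rows_table):
    """Return a query that applies the trained model to unlabeled rows."""
    return (
        f"SELECT *\n"
        f"FROM ML.PREDICT(MODEL `{model}`, (SELECT * FROM `{new_rows_table}`))"
    )


if __name__ == "__main__":
    features = ["tenure_months", "monthly_spend"]  # hypothetical feature columns
    print(create_model_sql("demo.churn_model", "demo.customers", "churned", features))
    print()
    print(evaluate_sql("demo.churn_model", "demo.customers_holdout"))
    print()
    print(predict_sql("demo.churn_model", "demo.new_customers"))
```

The printed statements are ordinary SQL text, so an analyst could paste them into the BigQuery query editor after substituting real table and column names; the Python wrapper is only there to make the three steps (train, evaluate, predict) explicit.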

Keywords:

Data warehousing, BigQuery, Machine learning, Data analysis, Predictive modeling, SQL, Data preprocessing, Model selection, Model evaluation, Cloud computing, Artificial intelligence.
