I admit it - I've a lot to learn when it comes to the internals of big data technologies. Even though I've worked on many data science projects at Goibibo, my knowledge of the principles involved is very limited. So when this project happened, where we needed to build some data pipelines for business needs, I found the perfect opportunity to explore the Hadoop and Hive ecosystem. This blog post is about a new tool that I came across, HPL/SQL.
Years back, the database world was ruled by relational databases, chief amongst them being Oracle. And it had this great language for writing queries - PL/SQL. From the Oracle website,
Big data systems
Years back, the database world was ruled by relational databases, chief amongst them being Oracle. And it had this great language for writing queries - PL/SQL. From the Oracle website,
PL/SQL is a procedural language designed specifically to embrace SQL statements within its syntax. PL/SQL program units are compiled by the Oracle Database server and stored inside the database. And at run-time, both PL/SQL and SQL run within the same server process, bringing optimal efficiency. PL/SQL automatically inherits the robustness, security, and portability of the Oracle Database.However, the explosion in information technology requirements led to a humongous amount of data being available for analysis. This gave rise to two other classes of databases:
Big data is data sets that are so voluminous and complex that traditional data processing application software are inadequate to deal with them.NOSQL database
NoSQL (originally referring to "non SQL") database provide a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.
So depending on the problem at hand, engineers now had a host of RDBMS (Oracle, MySQL, Aurora etc), NOSQL (Cassandra, MongoDB, Redis etc), and Big Data systems (Hadoop) to choose from.
This gave rise to a complex challenge - most data analysis earlier could be powered by the same RDBMS alone - example, businesses could build intelligence using stored procedures and triggers in PL/SQL on Oracle. In the world of multiple databases, engineers now have to learn multiple tools for data collection, analysis and so on - example Spark, Mapreduce, Hive and so on.
Different ecosystems and data analysis tools started coming out with their approaches to solve this problem. For example, connecting to multiple databases, custom SQL that is compatible with a covered set of databases and so on. However, the coverage has not been big so far.
Different ecosystems and data analysis tools started coming out with their approaches to solve this problem. For example, connecting to multiple databases, custom SQL that is compatible with a covered set of databases and so on. However, the coverage has not been big so far.
To solve this complex problem, Hive comes with a language/tool called as HPL/SQL. The promise of HPL/SQL is simple - it is a hybrid and heterogeneous language that understands syntax of most procedural SQL dialect, and can be used with any databases, running existing Oracle PL/SQL code on Apache Hive and Microsoft SQL Server, or running Transact-SQL on Oracle, Cloudera Impala or Amazon Redshift. So, you can use it as a language to write new procedural SQL with Hive, and you can use it as a tool to execute existing PL/SQL and logic that you have written for other databases.
Compared to writing code in Python, Java or Scala, HPL/SQL is much more friendly for the BI/SQL expert to run, and allows them to hit the ground running faster. It offers features like
- Control Flow Statements
- Built-in Functions
- Stored Procedures, Functions and Packages
- Exception and Condition Handling
- On-the-fly SQL Conversion
So, if you are exploring an efficient way to implement ETL processes in Hadoop, give HPL/SQL a try. Since Hive 2.0, HPL/SQL comes built in with Hive. All you need to do is,
When executing commands directly:
hplsql -e "SELECT * FROM src LIMIT 1"
When running from a file:
hplsql -f script.sql
You can go through the documentation to get more details if required.
I want to know if it supports inserting data by partition.
ReplyDeleteinsert into tablename partition(partition_name) from select * from table;
but it is error .
Function:not found partition
wish you see that
DeleteGreat Article Artificial Intelligence Projects
DeleteProject Center in Chennai
JavaScript Training in Chennai
JavaScript Training in Chennai
i don`t konw why my name is unknown .
ReplyDeletethis is my email : 10372756845aa@gmail.com
The control, stockpiling, and correspondence of advanced data require capable utilization of calculations. Software engineering encourages the use of these calculations.ExcelR Data Science Courses
ReplyDeleteAwesome blog. I enjoyed reading your articles. This is truly a great read for me. Keep up the good work!
ReplyDeletedata analytics course
data science course
business analytics course
I have to search sites with relevant information on given topic and provide them to teacher our opinion and the article.
ReplyDeleteExcelR data analytics courses
Excellent! I love to post a comment that "The content of your post is awesome" Great work!
ReplyDeletebest data analytics courses in mumbai
Excellent Blog! I would like to thank for the efforts you have made in writing this post. I am hoping the same best work from you in the future as well. I wanted to thank you for this websites! Thanks for sharing. Great websites!
ReplyDeletedata science course
wonderful article. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article. This article resolved my all queries.
ReplyDeleteData Science Course
wonderful article. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article. This article resolved my all queries. keep it up.
ReplyDeletedata analytics course in Bangalore
Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
ReplyDeleteCorrelation vs Covariance
Simple linear regression
Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
ReplyDeleteCorrelation vs Covariance
Simple linear regression
data science interview questions
I am really enjoying reading your well written articles. It looks like you spend a lot of effort and time on your blog. I have bookmarked it and I am looking forward to reading new articles. Keep up the good work.
ReplyDeleteSimple Linear Regression
Correlation vs Covariance
Very interesting blog. Many blogs I see these days do not really provide anything that attracts others, but believe me the way you interact is literally awesome.You can also check my articles as well.
ReplyDeleteData Science In Banglore With Placements
Data Science Course In Bangalore
Data Science Training In Bangalore
Best Data Science Courses In Bangalore
Data Science Institute In Bangalore
Thank you..
I am looking for and I love to post a comment that "The content of your post is awesome" Great work!
ReplyDeleteSimple Linear Regression
Correlation vs Covariance
Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
ReplyDeleteCorrelation vs Covariance
Simple linear regression
data science interview questions
Really nice and interesting post. I was looking for this kind of information and enjoyed reading this one. Keep posting. Thanks for sharing.
ReplyDelete360DigiTMG
Attend The Data Analyst Course From ExcelR. Practical Data Analyst Course Sessions With Assured Placement Support From Experienced Faculty. ExcelR Offers The Data Analyst Course.
ReplyDeleteData Analyst Course
Amazing Article ! I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
ReplyDeleteSimple Linear Regression
Correlation vs covariance
data science interview questions
KNN Algorithm
Logistic Regression explained
Very nice blogs!!! i have to learning for lot of information for this sites…Sharing for wonderful information.Thanks for sharing this valuable information to our vision. You have posted a trust worthy blog keep sharing, data sciecne course in hyderabad
ReplyDeleteVery nice blogs!!! i have to learning for lot of information for this sites…Sharing for wonderful information.Thanks for sharing this valuable information to our vision. You have posted a trust worthy blog keep sharing, best data science courses in hyderabad
ReplyDelete"Thank you for sharing wonderful information with us.Really useful for everyone data scientist courses
ReplyDelete"
Thank you for sharing wonderful information with us.Really useful for everyonedata scientist courses
ReplyDeletevery well explained .I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
ReplyDeleteSimple Linear Regression
Correlation vs covariance
data science interview questions
KNN Algorithm
Logistic Regression explained
Attend The Business Analytics Courses From ExcelR. Practical Business Analytics Courses Sessions With Assured Placement Support From Experienced Faculty. ExcelR Offers The Data Analytics Courses. data science course in Hyderabad
ReplyDeleteIf you are interested in data science, you may want to check out its training in data science course in india
ReplyDeleteI've read this post and if I could I desire to suggest you some interesting things or suggestions. Perhaps you could write next articles referring to this article. I want to read more things about it!
ReplyDeleteData Science courses
Aivivu chuyên vé máy bay, tham khảo
ReplyDeletemua ve may bay tu han quoc ve viet nam
giá vé máy bay vinh hồ chí minh
mua vé máy bay đi hà nội
vé máy bay hải phòng nha trang
vé máy bay đi Mỹ giá rẻ 2021
I have bookmarked your website because this site contains valuable information in it. I am really happy with articles quality and presentation. Thanks a lot for keeping great stuff. I am very much thankful for this site.
ReplyDeletedata scientist training and placement