Mammoth
  • |
  • Contact
  • |
CMD | Command Prompt, Inc. - PostgreSQL Solutions, Support & Hosting
  • |
  • |
  • |
  • |
  • |
Migrating hierarchical queries from Oracle to PostgreSQL
Posted Thursday Apr 26th, 2012 10:55am
by Alexey Klyukin
| Permalink

This is the second part in a series of blog posts describing PostgreSQL analogs of common Oracle queries



One of the most intricate Oracle specific constructions is "START WITH ... CONNECT BY". According to Oracle's documentation, the syntax is:
SELECT [query] [START WITH initial_condition] CONNECT BY [nocycle] condition.
This statement is commonly used to traverse hierarchical data in the parent-child order. It's easier to illustrate how it works with an example.



Consider a table that stores opponents moves in a game of chess. Each table row contain coordinates (in algebraic notation) of a single move by whites and the move in response by blacks, as well as a column that references a preceding move, making it possible to keep multiple continuations of a specific move for the post-game analysis.



CREATE TABLE moves(id integer, parent integer, white varchar(10), 
black varchar(10));


The following statements describe 2 variants of a very short game, the first one leading to the early checkmate (known as a scholar's mate), and the second one to the position where black successfully avoids being checkmated.


INSERT INTO moves VALUES(1, 0, 'e4', 'e5');
INSERT INTO moves VALUES(2, 1, 'Qh5', 'Nc6');
INSERT INTO moves VALUES(3, 2, 'Bc4', 'g6');
INSERT INTO moves VALUES(4, 3, 'Qf3', 'Nf6'); -- checkmate is avoided
INSERT INTO moves VALUES(5, 2, 'Bc4', 'Nf6');
INSERT INTO moves VALUES(6, 5, 'Qxf7#', NULL); -- blacks being checkmated

Let's build an Oracle query showing a sequence of moves that leads to the checkmate:


SELECT DISTINCT id AS final_move_id, 
LTRIM(SYS_CONNECT_BY_PATH(NVL(white,'')||':'||NVL(black,''),';'),';')||';' 
AS moves, LEVEL AS mate_in 
FROM moves WHERE white LIKE '%#' OR black LIKE '%#' START WITH id = 1 
CONNECT BY PRIOR id = parent;















FINAL_MOVE_ID

MOVES

MATE_IN

6

e4:e5;Qh5:Nc6;Bc4:Nf6;Qxf7#:;

4



The query instructs Oracle to look for a checkmate:



  • The search starts at the move with id = 1, as indicated in the START WITH clause, and considers all possible continuations that lead to a checkmate, denoted by the final '#' in the move's description.
  • Each move and its direct continuation, for instance, moves 2 and 5 represent the parent-child relationship, described by the PRIOR condition.
  • The search depth is stored in the LEVEL pseudo-column.

As a result, Oracle goes from one row to another only if the parent column of the new row contains the id of the current row, accumulating all visited rows in a result set. The SYS_CONNECT_BY_PATH clause produces a string out of the specified columns of the visited rows, connecting each (parent, child) pair by the designated character (';' in our case).



Being Oracle SQL extension, CONNECT BY is not available in PostgreSQL. Recent versions of PostgreSQL implement Common Table Expressions (CTE), SQL-standard way of dealing with hierarchical data. Here's one possible rewrite of the query above for PostgreSQL using recursive CTEs:
Read more...


Categories: OpenSource, PostgreSQL, SQL

Cool and Sexy: Open Source PostgreSQL enterprise contenders
Posted Friday Apr 20th, 2012 10:43am
by Joshua Drake
| Permalink

As with any healthy project, there will be offshoots and people will take the source, fork it and try to create something new, better, different or just.... How that person feels it should be. This is a good thing, it leads to new ideas, new communities and sometimes truly interesting pieces of software.


Postgres-XC has been around for a while, it is primarily developed by NTT and EnterpriseDB. It has a small community but a dedicated engineering/hacker backing. Postgres-XC is interesting because it keeps reasonably up to date with the latest Postgres (1.0 is set to be based on 9.1 of PostgreSQL) but provides a shared nothing clustering architecture. This type of infrastructure is one of the holy grails of web based applications.


Should Postgres-XC deliver on its promises (hint: it does), you will be able to scale out (as opposed to up which Postgres already does extremely well) at an almost 1 to 1 ratio. This means that instead of having to purchase 2 large machines at 10-12k a piece you could purchase 4 machines at 1.5k a piece and achieve similar performance (theoretically, I need to test this). It also means that scaling out in the "cloud" will be easier.


I invite everyone interested in PostgreSQL to take a look at Postgres-XC. It is going to 1.0 soon and it needs community members to help flesh out the warts that haven't been found yet.


Another Postgres fork that has recently appeared is tPostgres. tPostgres (doesn't that look wrong at the beginning of a sentence?) is set to do to Microsoft SQL what EnterpriseDB did to Oracle, with one minor, small, interesting, exception: tPostgres is Open Source. Further Microsoft SQL is more in line with PostgreSQL in the types of workloads you usually see it performing. Imagine a tPostgres with Postgres-XC. Imagine an open source way to easily port Microsoft SQL apps to PostgreSQL.


Now don't get me wrong, the latest versions of Microsoft SQL are actually good products. Yes, I did just say that. However, they are not Open Source, they are expensive (comparatively) and let's get real, we want everyone to run Postgres.


Unfortunately tPostgres is only just announced and they are literally at the beginning of building their community but as it is being initiated by Denis Lussier (co-founder of EnterpriseDB), I imagine that he will come through with something very interesting indeed.


Categories: Business, OpenSource, PostgreSQL, Python, SQL

PgNext: Cancelled
Posted Wednesday Apr 18th, 2012 11:43am
by Joshua Drake
| Permalink

It is with regret that I announce that PgNext is cancelled. I am not sure what is next for the PostgreSQL Conference series. The reasons are long and myriad and I will not bore you with them. However I will present the following video:




If you can't see the video, here is the video link.


That video represents why I would put on the conferences. They were fun. We had a good time.


If you are looking for other Postgres conferences there are the following:




Personally, I would suggest staying local and attending or help organize a local PUG day for PostgreSQL. PUG days are the best in small conferences. You are meeting with many locals, quite a few contributors usually show up, and you get to go home at night. The content is always top notch and chances are you know many of the people there. There are many. We recently had them in NYC, DC/Maryland, and Austin. There is a Denver PgDay on the 26th of October (no website yet) as well.


Categories: Business, OpenSource, PostgreSQL, Python, SQL

Remembering our roots
Posted Tuesday Apr 10th, 2012 11:22am
by Joshua Drake
| Permalink

Once upon a time, JD was a assistant manager for Block Buster video. This was a very long time ago and before a 23 month employment stint at Powells Books. It was at Powells that the world of computers was actually introduced to me as a viable employment option. While there I designed a special order database in DBase IV, was introduced to University Ingres, went through Book Buyer training, became a Novell Netware Administrator, and began a side business selling pre-built computers and parts. I also pretended to go to college and generally just had zero clue about life. I still don't have much of a clue about life.


Why does this matter? It doesn't really. I am just rambling because my sister asked me today something that surprised me, "What is UNIX?". I had to just kind of stare at the screen for a moment. Of course she asked me this as she was happily proclaiming that she received an iPhone for her birthday. How far we have come.


I explained what UNIX was, the basic history, it's involvement in the Internet and it occurred to me that for me, there was one very specific point in life that my professional world went from, "huh.... give me my 7.50/hr" to, "Hey, I can actually become educated in something useful.". It was the mental absorption of this book.


That book, allowed me to learn Unix, which allowed me to learn Linux (back when SLS was king), which brought me to Postgres95, which brought me to PostgreSQL, which brought me to co-writing this book, which lead me to be a major contributor to PostgreSQL not only through my work with the Fundraising group (via SPI)but also . I would also bring up the conferences but those are already mentioned today.


While waxing nostalgia I am reminded of a
recent blog post by Bruce Momjian where he mentions, "Postgres adoption is probably five years behind Linux's adoption.". I would agree with him, and would add that a lot of it is directly contributed to our development model. Many in the community have argued for years that time based releases of PostgreSQL would help development, many others... have argued for years that this is a bad idea. Many of those opponents of time based releasing, and one very influential one at that (TGL) are now starting to come around. More on that later, I have work to do!



Categories: Business, OpenSource, PostgreSQL, Python, SQL

PgNext (PostgreSQL Conference) CFP is still open
Posted Tuesday Apr 10th, 2012 10:51am
by Joshua Drake
| Permalink

As a reminder, the CFP for PgNext is still open. We are in Denver this year, let's make it rock! This year we are keeping it simple and getting back to roots. The conference is about community, networking with professionals, learning and in general having a good time. Who can't have a good time in Denver?

Categories: Business, OpenSource, PostgreSQL, SQL

URI connection strings, PgNext CFP and other generalities (FKlocks)
Posted Thursday Mar 22nd, 2012 11:45am
by Joshua Drake
| Permalink

Our team has been hard at work on several things. One is the URI patch for libpq which was just committed and sponsored by Heroku (Thanks Heroku). This is a novel patch that brings standard URI connection handling to libpq and any client/driver that decides to implement the functionality. You can see the patch here.


We are still actively working on PgNext: The Next PostgreSQL Conference. The folks on the organizing team have been an invaluable resource at helping us determine the direction of the conference. We have also been receiving a lot of emails thanking us for the selection of Denver as the location, many of them from new attendees. If you haven't submitted a talk yet, now is the time!


The FKLocks patch was unfortunately pushed to 9.3 due to some outstanding issues not the least of which was a performance regression under normal FK use. This is a large patch that team member Alvaro Herrera has been working on for a very long time. It is a patch that has the potential to greatly increase the performance of foreign keys. It has been a lesson in patience, evaluation of sponsored work (it was partially, and only partially sponsored), and resource allocation. Hopefully we can be done with this soon.



Categories: Business, OpenSource, PostgreSQL, SQL

Pearls of Oracle to PostgreSQL conversion
Posted Friday Mar 9th, 2012 03:30am
by Alexey Klyukin
| Permalink


We have been working on a large Oracle 8i conversion to PostgreSQL. Our customers were not concerned with the data conversion: there are tools like ora2pg and oracle foreign data wrapper to accomplish this. They do, however, have a significant number of queries that needs to be converted.



Apparently, most queries from Oracle and PostgreSQL look similar; after all, both are relational database systems, as opposed to Cassandra or MongoDB, seeking to adhere to the same standards. Unfortunately, Oracle is known for the non-standard syntax extensions that are widely used by DB developers. Fortunately, the PostgreSQL community went great lengths not only to rely on SQL standards, but to use them as a source for the new powerful features.



How do recent versions of PostgreSQL stack up against Oracle syntax extensions, you may ask. The answer is very well; in fact, there is hardly anything in Oracle 8i syntax that cannot be emulated by PostgreSQL. Let's consider a couple of examples, starting from the one of the most peculiar syntax construction Oracle is known for, outer joins:



Oracle versions before 9 used a non-standard syntax for left and right outer joins; support for full outer joins was missing altogether. Suppose we have a schema to track orders:


CREATE TABLE users(id INTEGER, email VARCHAR2(200));
CREATE TABLE products(id INTEGER, name VARCHAR2(200), price FLOAT(126));
CREATE TABLE orders(id INTEGER, userid INTEGER, productid INTEGER, 
                    quantity INTEGER);


A typical LEFT JOIN, returning all users that haven't made any orders, looks like this:


SELECT u.id FROM users u, orders o
WHERE o.userid IS NULL AND u.id = o.userid (+);


Oracle's OUTER JOIN is represented by a special WHERE clause with a '+' sign added. The '+' denotes the nullable side of the join: we are looking for users that don't have any rows associated with their ids in the orders table. The PostgreSQL equivalent of this query will be a straightforward:


SELECT u.id FROM users u LEFT JOIN orders.o ON (u.id = o.userid) 
WHERE o.userid IS NULL;


Clearly, the rule is simple: if you see a '(+)' in the Oracle's WHERE clause - it's really an outer join; if a '+' sign is on the left side of the binary operator - it's a RIGHT JOIN and vice versa.



Let's consider an Oracle construction that is a little more complex to emulate, Oracle's rownum column. This is a special pseudo-column that numbers rows returned by an Oracle query. For instance, to get positions of all products in the product list ordered by price, one can issue the following Oracle query:


SELECT rownum as position, id, name, price 
FROM (SELECT * FROM products ORDER BY price);


If 'products' stores data in no particular order, the rownum value would be different from the product id.



Unlike Oracle, PostgreSQL doesn't have a rownum column, so what would we do with the query above? Turns out, we can emulate it with the help of window functions, a feature available since PostgreSQL 8.4. These functions are capable of performing calculations over related group of rows called partitions. In our example, we need to apply the function called row_number() over the whole result set (denoted by the empty PARTITION BY clause), using ordering by price to compute the row number:


SELECT row_number() OVER (ORDER BY price) as position, id, name, price 
FROM products ORDER BY price;


Note that there is a subtle problem in the queries above. Different positions will be assigned to the items with equal price points. While there is no easy way to fix this in Oracle 8i, we can assign equal positions to the items with no price differences by switching the window function from row_number() to rank() in the PostgreSQL case:

SELECT rank() OVER (ORDER BY price) as position, id, name, price 
FROM products ORDER BY price


There is another common use case of Oracle's rownum column, to limit the number of rows returned from a query like this (used to extract the 5 most expensive products):


SELECT * FROM (SELECT * from products ORDER BY price DESC) WHERE rownum <= 5;


We need neither window functions, nor subqueries in the PostgreSQL equivalent of this query:


SELECT * FROM products ORDER BY price DESC LIMIT 5;	


Easy, right?



You are probably wondering why am I comparing the recent versions of PostgreSQL to Oracle 8i, being almost 15 years old now? The reason is, there are a number of customers still running such an old products not ready to shell out hundreds of thousands for an upgrade to a new Oracle major version. PostgreSQL might be the best and cost-effective way to get their data to a modern relational database system and, of course, upgrades to new major versions are free (and can be performed in-place with pg_upgrade).



There are more examples I'd like to demonstrate (including the conversion of Oracle's CONNECT BY clause), but this post is already getting too long, so I'm wrapping up for now. Stay tuned for further posts!



Categories: PostgreSQL, SQL

Another day, another recovery
Posted Tuesday Feb 14th, 2012 03:01pm
by Alvaro Herrera
| Permalink

This is something I have seen many times now: a customer calls us because they
lost some data and they want help recovering.



Now you must be wondering: surely if they lost data they can just recover from
their last backup, right? Right — they had that. However we know that
pg_dump takes a while to run and is stressful on the server, so it's normally
run just once a day or so. What happens if you've been running almost a full
work day since your last backup? It's a lot of data to lose.



Read more...

Categories: PostgreSQL, SQL

Decoding infomasks
Posted Wednesday Nov 30th, 2011 01:20pm
by Alvaro Herrera
| Permalink

Come on, admit it: you've always wanted to display the infomask bits from a tuple header in a human-readable manner, but you've never gotten around to it and you still keep htup.h in display while you peek around tuples.



Fortunately, that time is now past! Here's a short and simple recipe to decode the bits for your reading pleasure. Gone is the htup.h cheat sheet. Here's what you need:


Read more...

Categories: PostgreSQL, SQL

PgWest 2011: Only a week away
Posted Tuesday Sep 20th, 2011 10:04am
by Joshua Drake
| Permalink

PgWest is only a week a way folks, let's get those registrations in!

Categories: Business, OpenSource, PostgreSQL, Python, SQL


Copyright © 2000-2012 Command Prompt, Inc. All Rights Reserved. All trademarks property of their respective owners.