2020-06-01

Oracle WINDOW FUNCTION XOR SDO_GEOMETRY

Recently a friend asked me how to write a query for a SQL query which was slightly above simple SQL demo cases you can find all around. For me it was a good reason to brush my SQL (and in the process I had to reach out to Kim Berg Hansen for a beautiful match_return solution, but that's not this posts content). As this friend has his data in a postgres database, I managed to get a free DB at Alwaysdata and play with this. 

But as I was asking Kim for some help, it was obvious to work in Oracle world. I supposed most solutions should be able to transfer to Posgres easily. How wrong I was ...

A striped down testcase uses a table like 
create table gps (
   train          integer
 , gpstimestamp   timestamp with time zone
 , geom           sdo_geometry
);
Train describes a series of measurements: every second a GPS point is measured. On part of the problematic question is to find the distance between 2 consecutive points. As I didn't stop in 1992, the LAG function came to my mind. 

But in Kims testcase, LAG did not work at all: 
select
train, gpstimestamp, geom
  , nvl(sdo_geom.sdo_distance(geom, LAG(geom) 
                                      OVER (PARTITION BY train 
                                            ORDER BY  gpstimestamp), 0.005)
      , 0) as prev_dist
from gps
;
throws 
ORA-22901: cannot compare VARRAY or LOB attributes of an object type 
This error just does not makes any sense, as the geomertry does not need to be sorted at all. gpstimestamp is used for sorting, and oracle is quite capable of sorting timestamps of any kind. 
Maybe train needs to be sorted internally - PARTITION BY in theory just needs a comparison for equality, but Oracle often sorts in such cases. Even here train is of type integer - nothing easier to sort than this. 

But Kim knows SQL, and if one solution fails for any obscure reason, there is another one: 

   select
      train, gpstimestamp, geom, prev_dist
   from gps
   match_recognize (
      partition by train
      order by gpstimestamp
      measures
         nvl(sdo_geom.sdo_distance(geom, prev(geom), 0.005), 0) as prev_dist
      all rows per match
      pattern (any_row)
      define
         any_row as 1=1
   )
A beautiful workaround. - Funnily in livesql, even this workaround fails, this time with ORA-932:
ORA-00932: inconsistent datatypes: expected an IN argument at position 1 that is an instance of an Oracle type convertible to an instance of a user defined Java class got an Oracle type that could not be converted to a java class

Of course there are other solutions also: scalar subqueries, correlating inline views and many others. 

I also opened SR 3-23171402921 at Oracle - with the result it works as designed / expected. 



OK, back to my initial task: helping my friend with his query.

First I tried to port Kims MATCH_RECOGNIZE solution, but Postgres does not understand MATCH_RECOGNIZE yet. 
So I gave the LAG syntax a chance - and it worked: 
                lag(geom) over
   ( PARTITION BY train
     ORDER BY gpstimestamp ) AS prev_geom ,
is flawless accepted by Postgres!

There are several lessons learned: 
  • It's not as easy as I hoped to transfer SQL between DB-engines.
  • If my SQL seems to be over complicated, for sure it is - ask Kim ;-) 
  • If you want to do analytical functions with spatial data, use Postgres, not Oracle. 

1 Kommentar:

iudith hat gesagt…

Hello Martin,

Following your LinkedIn post:

I checked and found that, more generally, for using analytic functions with an object type
argument, the object type is (erroneously) required to have a MAP or OBJECT method defined.

For example, if you define your own object type as follows:

CREATE OR REPLACE TYPE my_sdo_geometry AS OBJECT (
the_sdo_geometry sdo_geometry,

MAP MEMBER FUNCTION mymap RETURN NUMBER
)
/

Type created.


CREATE OR REPLACE TYPE BODY my_sdo_geometry AS
MAP MEMBER FUNCTION mymap RETURN NUMBER
IS
BEGIN
RETURN the_sdo_geometry.sdo_gtype;
END;
END;
/

Type created.


then the following SELECT will work without error, though the MAP method is not effectively needed in this case:

select
train, gpstimestamp, my_sdo_geometry(geom)
, LAG(my_sdo_geometry(geom))
OVER (PARTITION BY train
ORDER BY gpstimestamp) prev_geom
from gps
/


Cheers & Best Regards,
Iudith Mentzel