What's new

Welcome to the GSA Community!

This is the Official User Community for GE's GeoSpatial Analysis products. By registering with us, you'll be able to discuss, share and private message with other members of our Community.
Registration is currently only allowed with a personalized invitation code, which you might have received previously. If you haven't got one:
please get in touch with your GE contact person or your regional moderator(s): see FAQ.

SignUp Now!

how to eleminate duplicate results

Björn Schaefer

Member
Confirmed GSA Customer
Joined
Oct 26, 2020
Messages
91
Hi everybody!
Using 5.2.6. Professional

If there is a resulttable with duplicates like this:
1611757245360.png

("Kund Kunden Nr" AND "OS TRIM" are identically at least two times)

Is there any simple way to eleminate the redundant results in a further step?

Thanks for your time in advance!
 
Last edited:

Joery Korobejnik

Moderator - Spatial Eye
Confirmed GSA Customer
Joined
Nov 26, 2020
Messages
35
Hi Björn,

It seems like you want to do a kind of "group by" to eliminate the identical records. There are several ways of doing this, but it depends where the duplication is originated from.

If it is created in a join (and you know for sure that the records are identical, because this option will take the first record from the two) you can use the Lookup Type Lookup - Single Record (forced) from the Properties tab.

If it is already in your source and you can access it using SQL language, you can create a SQL based source collection where you perform the "group by" to eliminate the duplicate.

Kind regards,

Joery Korobejnik
 

Björn Schaefer

Member
Confirmed GSA Customer
Joined
Oct 26, 2020
Messages
91
Hi Björn,

It seems like you want to do a kind of "group by" to eliminate the identical records. There are several ways of doing this, but it depends where the duplication is originated from.

If it is created in a join (and you know for sure that the records are identical, because this option will take the first record from the two) you can use the Lookup Type Lookup - Single Record (forced) from the Properties tab.

If it is already in your source and you can access it using SQL language, you can create a SQL based source collection where you perform the "group by" to eliminate the duplicate.

Kind regards,

Joery Korobejnik

Hi Joery
Thanks a lot so far :)

the results shown above are generated by a M:N Join between Customer (Kunde Abnehmer) and Building (Hausanschluesse Ort Strasse).
at this moment, it is configured like this:

1611821234672.png

my thought was, that this is needed to be sure, to get all possible combinations and eleminate the duplicates afterwards.

By configuring like this:

1611821542534.png
I am not quite sure, that there will be all results I need.

In Example:
Customer "A" belongs to several buildings, some buildings are identically by adresse, but not by floor
"Lookup - Mehrere Datensätze" delivers something like this:
1611822179296.png
Kund Kunden Nr 5007056 in combination with Haus Objektschlüssel [.substring(0,10)] 562006015 is one of this cases to be viewed as duplicate.

Lookup - einzelner Datensatz (sorted by Kund Kundennummer)
1611822762263.png

there is one hit to Haus Objektschluessel [.substring(0,10)] = 562006015 .
This looks pretty suitable so far, but "Kund Kunden Nr" = 5007056 had another hit by "Haus Objektschlüssel" = 562006019 and this one is missing in the lower result table...
 

Attachments

  • 1611822347246.png
    1611822347246.png
    151.8 KB · Views: 1
Last edited:

Björn Schaefer

Member
Confirmed GSA Customer
Joined
Oct 26, 2020
Messages
91
for a better understanding:
joining this two tables creats a "new ID" (Kund Kunden Nr + Haus Objektschluessel.substring(0,10)). This "New ID" should be the unique one...
 

Joery Korobejnik

Moderator - Spatial Eye
Confirmed GSA Customer
Joined
Nov 26, 2020
Messages
35
Hi Björn,

Thank you, it is much clearer now what you want to accomplish. I do have some examples of similar cases but I solved it with business collections (via self join) and not the analyses blocks. I will discuss this with a couple of my collogues to see if the method can be used in this case without the risk of not having all the data in the result collection.

By the wat what is your data source?
 

Björn Schaefer

Member
Confirmed GSA Customer
Joined
Oct 26, 2020
Messages
91
Hi Björn,

Thank you, it is much clearer now what you want to accomplish. I do have some examples of similar cases but I solved it with business collections (via self join) and not the analyses blocks. I will discuss this with a couple of my collogues to see if the method can be used in this case without the risk of not having all the data in the result collection.

By the wat what is your data source?

The data source is an oracle DB (our ERP System)
 

Joery Korobejnik

Moderator - Spatial Eye
Confirmed GSA Customer
Joined
Nov 26, 2020
Messages
35
Hi Björn,

Thank you for the conformation. I will prepare an example with description on Monday. I want to make it a bit more general so that you can reuse the method for similar cases.

Have a great weekend,

Joery
 

Joery Korobejnik

Moderator - Spatial Eye
Confirmed GSA Customer
Joined
Nov 26, 2020
Messages
35
Hi Björn,

As you are using Oracle as a data source you can try the Query Tool (link to the online help) capability to eliminate the duplicate records. In SQL a function like group by or distinct can be used. I am trying to create an example using our Oracle XE database with similar data for you (it takes a bit longer to get the infrastructure set up). As you are using the professional edition and don't have this capability I will provide you a subscription license for a trial period so that you can set up this configuration (after that you can still use the project with the Professional edition, but you wont be able to configure it).

General introduction from the online help:

For the databases supported by GeoSpatial Analysis Ultimate Edition it is possible to use a query tool that allows for setting up queries and collections using the database's native SQL.

The query tool allows to:
  • Use a query editor to run ad-hoc, native SQL queries against a database
    This allows for performing ad-hoc queries against a database without the necessity to open an external tool. Though there is no intention to replace the functionality offered by these external tools (i.e. retrieving query/execution plans), the possibility to run native SQL queries directly from GeoSpatial Analysis will in many situations reduce the need for these external tools.

  • Create and save SQL based queries that can be used throughout the product like regular queries
    This allows to persist queries that have been created for use across sessions. These queries will be saved as part of the database's configuration in the project file and are later available from the query tool as well as the search ribbon.

  • Create and save SQL based queries as source collections that can be used like regular source collections
    This takes the previous step further by saving a created query as a collection. This collection will be available as part of the set of collections that the feature source provides to the end user, effectively acting as a database view of which the definition is stored in the project file instead of the database itself. With the possibility to use these collections as building blocks for business collections, effectively brings these queries/collections into the area of reporting and analysis.
Defining queries and collections using SQL allows for a bottom up approach towards data retrieval from databases. The databases for which SQL based queries are supported, are:
  • ESRI ArcSDE
  • GeoPackage
  • Microsoft SQL Server
  • MySQL
  • Oracle
  • PostGIS
  • SAP HANA
  • Spatial Warehouse
  • SQLite
Kind regards,

Joery
 

Björn Schaefer

Member
Confirmed GSA Customer
Joined
Oct 26, 2020
Messages
91
Hi Joery
I'm excited about this possibility shown above, but you may allow another question:
GSA is a wonderfull tool, 'cause there is (almost) no need for scripting to create results...
the thing, I want to get solved with my thread here is something similar to this Excel - function:
1612335157720.png

Is there any possibilty for upcoming versions, to rebuild a simple to use function like this within the analytics part of GSA Pro?
 

Joery Korobejnik

Moderator - Spatial Eye
Confirmed GSA Customer
Joined
Nov 26, 2020
Messages
35
Hi Björn,

That's unfortunately not something I can decide, but you can log it as a request via the GSA support system. There are also ways of doing it in GSA Professional but it is more a trick than a feature. Lets explore both ways with the current product and see what solves your challenge.

Joery
 

Björn Schaefer

Member
Confirmed GSA Customer
Joined
Oct 26, 2020
Messages
91
Hi Björn,

That's unfortunately not something I can decide, but you can log it as a request via the GSA support system. There are also ways of doing it in GSA Professional but it is more a trick than a feature. Lets explore both ways with the current product and see what solves your challenge.

Joery
Hi Joery
No Problem, I will start a request via support...
to be continued ;-)
 

Joery Korobejnik

Moderator - Spatial Eye
Confirmed GSA Customer
Joined
Nov 26, 2020
Messages
35
Hi Björn,

We have investigated multiple options with Oracle SQL statements and it seems that the window functions / analytic functions can provide the unique result you are looking for. Because of the duplicates you need some form of aggregation in your result.

Example we made by using an address table with duplicate street names in different municipalities:

SELECT STREET_NAME, MUNICIPALITY_NAME, Rank_in_street_group
FROM (SELECT STREET_NAME, MUNICIPALITY_NAME,
ROW_NUMBER() OVER(PARTITION by a.STREET_NAME ORDER BY 1) as Rank_in_street_group
FROM ADDRESSPORTAL.ADDRESS a)
WHERE Rank_in_street_group =
1

So to summarize:

1. Right click on the Oracle Feature Source and start the Query Tool
2. Prepare the SQL statement to yield the correct result (you can also perform joins with different tables here)
3. Save the query as a collection (it will become available in the feature source overview as a Table that you can promote towards a Business Collection)
4. Now you can use it as any other table within the product (also in your GSA Professional edition)

If you need any help getting the SQL statement set up please share your results here so that we can think with you.

Kind regards,

Joery
 

Björn Schaefer

Member
Confirmed GSA Customer
Joined
Oct 26, 2020
Messages
91
Hi Björn,

We have investigated multiple options with Oracle SQL statements and it seems that the window functions / analytic functions can provide the unique result you are looking for. Because of the duplicates you need some form of aggregation in your result.

Example we made by using an address table with duplicate street names in different municipalities:

SELECT STREET_NAME, MUNICIPALITY_NAME, Rank_in_street_group
FROM (SELECT STREET_NAME, MUNICIPALITY_NAME,
ROW_NUMBER() OVER(PARTITION by a.STREET_NAME ORDER BY 1) as Rank_in_street_group
FROM ADDRESSPORTAL.ADDRESS a)
WHERE Rank_in_street_group =
1

So to summarize:

1. Right click on the Oracle Feature Source and start the Query Tool
2. Prepare the SQL statement to yield the correct result (you can also perform joins with different tables here)
3. Save the query as a collection (it will become available in the feature source overview as a Table that you can promote towards a Business Collection)
4. Now you can use it as any other table within the product (also in your GSA Professional edition)

If you need any help getting the SQL statement set up please share your results here so that we can think with you.

Kind regards,

Joery

Hi Joery
It took a little time to work on this again, but now, I'm a step further, i guess:
with a little Help from a collegue, I've made a request like this
1613999951978.png

I guess, this will work... but I do not Know, if this is worth it, to buy an ultimate licence ;-)
 

Joery Korobejnik

Moderator - Spatial Eye
Confirmed GSA Customer
Joined
Nov 26, 2020
Messages
35
Hi Björn,

Great to see that it works. Now you can save and promote this script within your project and use it as a regular Business Collection with GSA Professional. I wanted to use the ultimate edition, just to provide you with the most clean solution to your problem.

To buy or not to buy is a different discussion and as long as you can continue with your project after this side step and you are happy with it than I am happy as well :)

Kind regards,

Joery
 

Björn Schaefer

Member
Confirmed GSA Customer
Joined
Oct 26, 2020
Messages
91
Thank you so much for your support! I guess, I need to improve my SQL skills "a little" ;-)
 

Björn Schaefer

Member
Confirmed GSA Customer
Joined
Oct 26, 2020
Messages
91
May I ask an additonal thing?
By saving a query like this:

1626090469314.png

(Not as data source)

where may i find / use this query within an analysis (or business object) if it is not stored as data (feature) source
It seems not to be the best way to save every possible query as an own datasource, or is that the given way to use the query tool?

If it is needed to check this box "Objekt":
1626093638252.png
I'm not able to, and I do not know, why....
So this saved query doesn't appear at any other point within GSA
 
Last edited:

Joery Korobejnik

Moderator - Spatial Eye
Confirmed GSA Customer
Joined
Nov 26, 2020
Messages
35
Hi Björn,

There is an article in the online help about this topic: Link to online help, you should able to promote this query as a Business Collection and use it as if it was a table from your data source.

kind regards,

Joery Korobejnik
 

Björn Schaefer

Member
Confirmed GSA Customer
Joined
Oct 26, 2020
Messages
91
Hi Joery

I already found that article, but unfortunately this is not working in my case: I can't create an BO using that saved query, 'cause the query does not appear at any single spot within GSA, except within the query tool itself.

Query tool:
1626154150705.png

BO Designer:
(I've checked all connected datasources)

1626154407373.png
 

Attachments

  • 1626154225455.png
    1626154225455.png
    123.9 KB · Views: 0

Frank Pistorius

Spatial Eye
Confirmed GSA Customer
Joined
May 26, 2020
Messages
36
Hi Björn,

What you could try is checking the 'Save as source collection' flag upon saving the query.

1626168616051.png

This should result in a source collection (in this feature source) with that name.
1626168727134.png

You can then promote this source collection to a business collection as follows:
1626168745236.png

(Alternatively, you can just create a business collection and include the source collection from there.)

The resulting business collection can then be used like any other business collection.
1626168833638.png

You can add/change/remove fields in the business collection, like I've done with the buffered geometry field.

Hope this helps in setting this up.

Regards,

Frank
 

Björn Schaefer

Member
Confirmed GSA Customer
Joined
Oct 26, 2020
Messages
91
that works fine... maybe I missunderstood the option
1626180777236.png
"als Elementquelle speichern"
I thought, this would save a complete new feature source instead of a table within an existing feature source...

Thanks a lot again to all of you for taking time...
 
Top