GSA 5.2.7.1 replace special signs within a string

Björn Schaefer · Jun 22, 2021

Hi there

There is a foreign key source containing something like this:

gis.ega_ha_leitungsabschnitt.85249653Smallworld

I need to extract the ID in the middle ...

With replace("string", "") I can delete the given gis.ega_ha_leitungsabschnitt. and Smallworld,
but I have no idea how to eliminate the special character in between, since I don't even know which character it is.

I know the substring methods, but I found nothing about removing the last character...

Any ideas?

Frank Pistorius · Jun 22, 2021

Hi Björn,

Can you explain what you mean by Sign? Do you mean there are potentially other characters in there like commas (,)?

And, will the ID always be the part with digits and the rest of the string will always be without digits?

Thanks,

Frank

Björn Schaefer · Jun 22, 2021

"Sign" may be the wrong translation; in German we call it "Sonderzeichen" (special character)...

The parts in red need to be deleted; the part marked in green needs to be extracted for further steps...

There is "something" in front of "Smallworld",

and it is neither a letter nor a number; it is some ASCII-coded character... I need to delete it, because the desired result should be this:

85249653

but what I currently get is this:

Frank Pistorius · Jun 22, 2021

Hi Björn,

Assuming the ID consists of digits only and there are no other digits in the surrounding text, this can be handled using a regular expression.

I've set up an Excel sheet with some sample data. In my case, I've added a special character in there as well.

The derived business collection looks like this:

This business collection has a couple of (test) fields added.

IDText
This field contains the ID as a string and is set up using a regular expression.

# Texts.Text.RegexSubString("[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]")
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# A group getting exactly 8 digits.
#
# Texts.Text.RegexSubString("[0-9]+", 0)
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Picking the first match matching the digits-group of any length (the +).
# Indexing is zero-based and with 0 being the default, it is omitted in
# the expression below.
#
# The subString is added to indicate that the string will
# be no longer than 10 characters. Otherwise the field-type is Memo.
Texts.Text.RegexSubString("[0-9]+").SubString(0, 10)

The regular expression used finds all groups containing one or more digits. It picks the first matching group.
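
For anyone who wants to try the pattern outside GSA, here is a minimal sketch in Python. It is not GSA expression syntax: Python's re.search only stands in for RegexSubString, the helper name first_digit_group is made up for this illustration, and "\x1f" is an assumed placeholder for the unknown special character in front of "Smallworld".

```python
import re

# Sample value from this thread. "\x1f" is only an assumed stand-in for the
# unknown special character in front of "Smallworld".
SAMPLE = "gis.ega_ha_leitungsabschnitt.85249653\x1fSmallworld"


def first_digit_group(text, pattern=r"[0-9]+"):
    """Return the first run of one or more digits in text, or None."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


id_text = first_digit_group(SAMPLE)
print(id_text)       # 85249653
print(int(id_text))  # the same value converted to a number
```

Because the pattern only looks for digits, it does not matter which special character follows the ID, as long as that character is not a digit itself.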

ID
This is the number you are looking for and is basically the same expression, but with the result converted to a number (in this case an integer).
This is probably the only field you are interested in for your scenario.

# Texts.Text.RegexSubString("[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]")
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# A group getting exactly 8 digits.
#
# Texts.Text.RegexSubString("[0-9]+", 0)
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Any number of digits by picking the first match matching the digits-group.
# Indexing is zero-based and with 0 being the default, it is omitted in
# the expression below.
#
# In case the number doesn't exceed the integer range, use the
# ToInteger() method. Otherwise, ToLong() could be used.
Texts.Text.RegexSubString("[0-9]+").ToInteger()

PartBeforeID
In case you are interested in other parts in the string, you can find all groups matching non-digits.
In our case, we are looking for the first group matching non-digits, which leads to the part before the ID.

# Texts.Text.RegexSubString("[^0-9]+", 0)
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Gets the first element not matching the digits group.
#
# Indexing is zero-based and with 0 being the default, it is omitted
#
# The subString is added to indicate that the string will
# be no longer than 30 characters. Otherwise the field-type is Memo.
Texts.Text.RegexSubString("[^0-9]+").SubString(0, 30)

Note the use of the ^ character at the start of the bracket expression to denote negation, yielding groups of characters that are not digits.

PartAfterID
A similar expression can be used to retrieve the second group not containing digits. In this specific case, this would lead to the group following the ID.

# Texts.Text.RegexSubString("[^0-9]+", 1)
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Gets the second element not matching the digits group.
#
# Indexing is zero-based, so this picks the second group
#
# The subString is added to indicate that the string will
# be no longer than 30 characters. Otherwise the field-type is Memo.
Texts.Text.RegexSubString("[^0-9]+", 1).SubString(0, 30)

Those fields are probably not of interest but added for completeness.
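
The two non-digit groups can be checked outside GSA in the same way. This is again only a Python sketch under assumptions: re.findall stands in for RegexSubString with an index, non_digit_groups is a hypothetical helper name, and "\x1f" is a placeholder for the unknown special character.

```python
import re

# "\x1f" is an assumed stand-in for the unknown special character.
SAMPLE = "gis.ega_ha_leitungsabschnitt.85249653\x1fSmallworld"


def non_digit_groups(text):
    """Return every run of one or more non-digit characters, in order."""
    return re.findall(r"[^0-9]+", text)


groups = non_digit_groups(SAMPLE)
part_before_id = groups[0]  # zero-based index 0: the text before the ID
part_after_id = groups[1]   # index 1: the text after the ID

print(repr(part_before_id))
print(repr(part_after_id))
```

With this sample, the second group starts with the special character, because that character is not a digit either.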

Browsing the collection:

So, in your case, you can use the regular expression shown in the ID field:

Texts.Text.RegexSubString("[0-9]+").ToInteger()

In your situation this would be:

Fremdschluessel.[Frm Key].RegexSubString("[0-9]+").ToInteger()

Hopefully, the assumption about the digits is correct (ID consists of digits only, the other parts do not contain digits).
In that case, the above expression should do the job.

The project that was set up for this example has been attached.

Regards,

Frank

Björn Schaefer · Jun 22, 2021

Hi Frank

Thanks a lot again... ;-) This works properly; indeed, the object ID consists of digits only...

I had tried to find something like "cut the LAST n digits" among the substring methods, but couldn't find anything suitable...

By the way: I noticed that the special character isn't shown in my initial post... sorry for that, I didn't use the preview function.

Have a great week...!!

Frank Pistorius · Jun 22, 2021

Great to hear this applies to your situation. You're welcome.

Frank Pistorius · Jun 22, 2021

Hi Björn,

Since you mentioned the possibility of cutting off the last part of the string, please find an alternative expression below.
This doesn't use regular expressions but does the work more directly, using the known text parts.

And as text, for easier copy-pasting:

{
# Set the end text
var lastText = "Smallworld";
# Find the index in the string of this end text
# Subtract 1 for the additional character
var lastTextIndex = Texts.Text.IndexOf(lastText) - 1;
# Get the string with the last part removed
# (remove from start-index, till the end so no length specified)
var lastPartRemoved = Texts.Text.Remove(lastTextIndex);

# Set first text
var firstText = "gis.ega_ha_leitungsabschnitt.";

# Get length of the first text
var firstTextLength = firstText.Length;

# Get the substring starting at this index
var digitsText = lastPartRemoved.SubString(firstTextLength);

# Return the digitsText as an integer
return digitsText.ToInteger();
}

This retrieves the digits very explicitly by removing the last and first parts of the text.
Though this does the job as well, I prefer the alternative based on regular expressions because it's much simpler.

But maybe the additional expressions used here are of use in other circumstances.
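
The same index arithmetic can be tried outside GSA. The following is a Python sketch under assumptions, not GSA syntax: str.index and slicing stand in for IndexOf, Remove and SubString, extract_id is a hypothetical helper name, and "\x1f" is a placeholder for the unknown special character.

```python
FIRST_TEXT = "gis.ega_ha_leitungsabschnitt."
LAST_TEXT = "Smallworld"

# "\x1f" is an assumed stand-in for the unknown special character.
SAMPLE = FIRST_TEXT + "85249653" + "\x1f" + LAST_TEXT


def extract_id(text):
    """Cut off the known end text (plus one extra character) and the known start text."""
    # Position of the end text, minus 1 for the additional character
    last_text_index = text.index(LAST_TEXT) - 1
    # Keep everything before that position
    last_part_removed = text[:last_text_index]
    # Drop the known first text and convert the remainder to a number
    digits_text = last_part_removed[len(FIRST_TEXT):]
    return int(digits_text)


print(extract_id(SAMPLE))  # 85249653
```

Note that the "- 1" assumes exactly one extra character in front of the end text. If a value has no such character, this sketch cuts off the last digit of the ID instead, which is one more reason to prefer the regular expression.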

Regards,

Frank

