Published May 13, 2024 | Version 1.0.0
Software Open

Lithology Keyword Automation Tool (LithoKAT)

  • 1. University of Nebraska-Lincoln

Description

This set of Python scripts automates the standardization of lithology descriptions in borehole databases. It converts unstandardized terms (raw descriptions) to standardized terms (keywords). It is particularly useful for databases in which lithology keywords were never standardized. The dictionaries are customizable.

The scripts should be run in the order provided by the leading numbers (from 00 to 06). Example data are included in the files.

00_stack_horizontals.py: For rows in the database where multiple intervals were entered on a single line (horizontals), extracts those intervals, computes their depths, and stacks them into individual rows.
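The stacking step can be illustrated with a minimal sketch. The row layout used here (a well identifier plus a list of thickness/description pairs) and the function name `stack_horizontals` are assumptions made for illustration; the actual script may read a different layout and compute depths differently.

```python
def stack_horizontals(row, start_depth=0.0):
    """Expand one 'horizontal' row into one record per interval.

    Hypothetical layout: row = {"well_id": ..., "intervals": [(thickness, description), ...]}
    Depths are accumulated from the interval thicknesses (an assumption).
    """
    stacked = []
    top = start_depth
    for thickness, description in row["intervals"]:
        bottom = top + thickness
        stacked.append(
            {
                "well_id": row["well_id"],
                "top": top,
                "bottom": bottom,
                "description": description,
            }
        )
        top = bottom  # the next interval starts where this one ends
    return stacked


example_row = {"well_id": "W1", "intervals": [(5, "topsoil"), (10, "sandy clay")]}
for record in stack_horizontals(example_row):
    print(record)
```

Each output record carries its own top and bottom depth, so later scripts can treat every interval as an ordinary row.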

01_convert_lowercase.py: Converts all descriptions to lowercase to simplify string matching in the following scripts.

02_reverse_adj_noun_pairs.py: Takes leading adjectives in adjective-noun pairs, such as "sandy clay" or "clayey sand", and places them after the noun, yielding "clay, sandy" or "sand, clayey". This simplifies the use of dictionaries in the following scripts.
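A minimal sketch of the reversal, assuming lowercase input and small hard-coded word lists. The names `ADJECTIVES`, `NOUNS`, and `reverse_adj_noun` are hypothetical; the actual script presumably draws its terms from the adjective-noun dictionary file, whose format is not described here.

```python
import re

# Hypothetical word lists, for illustration only.
ADJECTIVES = {"sandy", "clayey", "silty", "gravelly"}
NOUNS = {"sand", "clay", "silt", "gravel"}

# Match "<adjective> <noun>" as whole words.
_PAIR = re.compile(
    r"\b(" + "|".join(sorted(ADJECTIVES)) + r")\s+(" + "|".join(sorted(NOUNS)) + r")\b"
)


def reverse_adj_noun(description):
    """Rewrite 'adjective noun' as 'noun, adjective' wherever a pair is found."""
    return _PAIR.sub(r"\2, \1", description)


print(reverse_adj_noun("sandy clay"))        # clay, sandy
print(reverse_adj_noun("brown clayey sand"))  # brown sand, clayey
```

Putting the noun first means a later left-to-right search meets the primary lithology before its modifier.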

03_extract_keywords.py: Uses a dictionary to search each description for terms and returns the position (as an integer) of each term found in the string. This position is used to rank the primary and secondary lithology terms.
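The position lookup might look like the following sketch. It assumes lowercase input, whole-word matching, and character offsets as the integer position; the function name `find_term_positions` and the term list are hypothetical, and the actual script may count positions differently.

```python
import re


def find_term_positions(description, terms):
    """Return {term: character offset of its first whole-word match}.

    Terms that do not occur in the description are omitted.
    """
    positions = {}
    for term in terms:
        match = re.search(r"\b" + re.escape(term) + r"\b", description)
        if match:
            positions[term] = match.start()
    return positions


print(find_term_positions("sand and gravel over clay", ["clay", "sand", "gravel", "shale"]))
```

A smaller offset means the term appears earlier in the description, which is the property the ranking step relies on.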

04_rank_sort_keywords.py: Ranks the terms by their position in the string, then replaces each term with its keyword from the dictionary.
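Ranking and replacement can be sketched as below, assuming the positions are held in a mapping from term to integer offset and that the dictionary maps each raw term to one keyword. The names `rank_keywords` and `TERM_TO_KEYWORD`, and the keyword spellings, are hypothetical.

```python
# Hypothetical term-to-keyword dictionary, for illustration only.
TERM_TO_KEYWORD = {
    "sand": "SAND",
    "gravel": "GRAVEL",
    "clay": "CLAY",
}


def rank_keywords(positions, term_to_keyword):
    """Order terms by position in the description, then map each to its keyword."""
    ranked_terms = sorted(positions, key=positions.get)
    return [term_to_keyword[term] for term in ranked_terms]


positions = {"gravel": 9, "clay": 21, "sand": 0}
keywords = rank_keywords(positions, TERM_TO_KEYWORD)
print(keywords)  # ['SAND', 'GRAVEL', 'CLAY']
```

Under this sketch, the first element would serve as the primary lithology and the second as the secondary lithology.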

05_tag_bedrock_tops.py: Optional script to find bedrock terms. Can be used as a first pass operation for mapping the bedrock surface.
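One way to tag a bedrock top is to take the shallowest interval in a borehole whose description contains a bedrock term. The sketch below assumes each interval is a (top depth, description) pair; the term list and the function name `bedrock_top` are hypothetical, and the actual script may use different terms or criteria.

```python
import re

# Hypothetical bedrock terms, for illustration only.
BEDROCK_TERMS = {"bedrock", "shale", "limestone", "sandstone", "siltstone"}


def bedrock_top(intervals):
    """Return the top depth of the shallowest interval containing a bedrock term.

    intervals: iterable of (top_depth, description) pairs for one borehole.
    Returns None when no interval mentions a bedrock term.
    """
    for top, description in sorted(intervals, key=lambda interval: interval[0]):
        words = set(re.findall(r"[a-z]+", description.lower()))
        if words & BEDROCK_TERMS:
            return top
    return None


log = [(0, "topsoil"), (5, "sand, clayey"), (42, "shale, gray"), (60, "limestone")]
print(bedrock_top(log))  # 42
```

Because this is only a first pass, boreholes that end in unconsolidated material simply yield no bedrock top.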

06_convert_units.py: Converts depths from U.S. customary units to metric units.
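A minimal sketch of the depth conversion, assuming the source depths are in feet and the target unit is meters (1 ft = 0.3048 m exactly). The function name `feet_to_meters` and the rounding to two decimals are assumptions; the actual script may handle other units or precisions.

```python
FEET_TO_METERS = 0.3048  # exact by definition of the international foot


def feet_to_meters(depth_ft, ndigits=2):
    """Convert a depth in feet to meters, rounded to ndigits decimals."""
    return round(depth_ft * FEET_TO_METERS, ndigits)


print(feet_to_meters(100))  # 30.48
```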

Files

adjective-noun_dictionary.txt

Files (126.9 kB)

Checksum                              Size
md5:89f2c87acb682bc69c1e9aa24a32fc6e  2.4 kB
md5:3986b4eda21175f3b361e16d1d963c6a  159 Bytes
md5:338659af2f7457723f1f85f29ddaa005  549 Bytes
md5:affcb17b76ae629228dfb80e31152eeb  2.1 kB
md5:6e8ad9535524c96b00b81f1b8133d461  3.6 kB
md5:65eab92ebf8b467043df6d786c61e9b7  2.2 kB
md5:7dc84c66dc99d307366a1272bb2d601d  1.8 kB
md5:eb867ff73eaad3e5b4d7725b470d5dc3  4.3 kB
md5:dbf437bc70faf7c2f31d76215ff18ae1  849 Bytes
md5:29b3fde24be9864b3be81a65ad0b7fe9  36.0 kB
md5:f2c8ca06ba597faf7c24473c1280c46f  2.8 kB
md5:686d3c0dcea69ecc28c4957f3ec77188  2.2 kB
md5:103be397d67434ad79688daac4eeb1c8  2.5 kB
md5:a738909827f94a792b815e275aa3be7e  2.8 kB
md5:12d02c6c4c54f1452a5655267b9837ca  3.4 kB
md5:e8a706d058a8f73135ff24bdeaa7a8da  2.9 kB
md5:4b92814098560b1b0071561aa59a623a  696 Bytes
md5:cdeea600cb3715797cc1816e4b6506bb  2.0 kB
md5:02213063130e1a89623a993cc7529a93  53.8 kB

Additional details

Software

Development Status
Active