Database retrieval and population methods
Python3 script for retrieving data from the MCSQ database.
Before running the script, install the requirements: pandas, numpy, SQLAlchemy, psycopg2.
Author: Danielly Sorato
Author contact: danielly.sorato@gmail.com
-
retrieve_from_tables.
build_id_dicts_per_language
(language) Gets all text segments and their IDs and builds one dictionary per item type.
- Parameters
language (param1) – target language.
- Returns
Four different dictionaries (one for each item type). The IDs are the keys and the text segments are the values.
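The grouping described above can be illustrated without a database. The following is only a minimal sketch: the row layout, the `build_id_dicts_sketch` name, and the order in which the four dictionaries are returned are assumptions made for illustration, not the script's actual implementation. The four item types are the ones named elsewhere in this reference (introduction, instruction, request, response).

```python
# Hypothetical rows standing in for a database query result:
# (segment ID, item type, language, text). The real schema may differ.
ROWS = [
    (1, "introduction", "ENG", "We would like to ask you some questions."),
    (2, "instruction", "ENG", "Please choose one answer."),
    (3, "request", "ENG", "How satisfied are you with your life?"),
    (4, "response", "ENG", "Very satisfied"),
    (5, "request", "GER", "Wie zufrieden sind Sie mit Ihrem Leben?"),
]

ITEM_TYPES = ("introduction", "instruction", "request", "response")


def build_id_dicts_sketch(rows, language):
    """Return one {ID: text} dictionary per item type for the given language."""
    dicts = {item_type: {} for item_type in ITEM_TYPES}
    for segment_id, item_type, row_language, text in rows:
        if row_language == language:
            dicts[item_type][segment_id] = text
    # Return the four dictionaries in a fixed order (an assumption).
    return tuple(dicts[item_type] for item_type in ITEM_TYPES)


introductions, instructions, requests, responses = build_id_dicts_sketch(ROWS, "ENG")
print(requests)
```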
-
retrieve_from_tables.
create_tagged_text_dict
(id_list) Gets the survey_itemid and the POS tagged text from the survey_item table and creates a dictionary.
- Parameters
id_list (param1) – a language-specific list of the target segment IDs in the alignment table.
- Returns
A dictionary with target survey_itemids as keys and POS tagged text as values.
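As a rough illustration of this lookup, the sketch below uses an in-memory SQLite database in place of the real MCSQ PostgreSQL database. The `pos_tagged_text` column name, the POS tag notation in the sample rows, and the `create_tagged_text_dict_sketch` name are assumptions; only the `survey_item` table and the `survey_itemid` column are named in this reference.

```python
import sqlite3


def create_tagged_text_dict_sketch(connection, id_list):
    """Map each requested survey_itemid to its POS tagged text."""
    if not id_list:
        return {}
    # One "?" placeholder per ID keeps the query parameterized.
    placeholders = ", ".join("?" for _ in id_list)
    query = (
        "SELECT survey_itemid, pos_tagged_text FROM survey_item "
        f"WHERE survey_itemid IN ({placeholders})"
    )
    return dict(connection.execute(query, list(id_list)).fetchall())


# In-memory stand-in for the real database (column layout is an assumption).
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE survey_item (survey_itemid TEXT PRIMARY KEY, pos_tagged_text TEXT)"
)
connection.executemany(
    "INSERT INTO survey_item VALUES (?, ?)",
    [
        ("item_1", "How<ADV> are<VERB> you<PRON>"),
        ("item_2", "Very<ADV> well<ADV>"),
        ("item_3", "Thank<VERB> you<PRON>"),
    ],
)

tagged = create_tagged_text_dict_sketch(connection, ["item_1", "item_3"])
print(tagged)
```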
-
retrieve_from_tables.
get_ids_from_alignment_table
(survey_itemid) Gets all IDs (either source or target) from the alignment table.
- Parameters
survey_itemid (param1) – name of the column indicating whether the IDs to be retrieved are source or target IDs.
- Returns
A list of survey_itemids.
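Because the parameter is a column name rather than a value, it cannot be passed as an ordinary query parameter. One possible way to handle that is sketched below, assuming an `alignment` table with the two ID columns named in this reference. The table name, the whitelist check, and the use of SQLite instead of PostgreSQL are illustrative assumptions, not necessarily what the script does.

```python
import sqlite3

# Column names taken from this reference; the whitelist guards against
# interpolating arbitrary text into the SQL statement.
ALLOWED_COLUMNS = {"source_survey_itemid", "target_survey_itemid"}


def get_ids_sketch(connection, survey_itemid):
    """Return every value of the chosen ID column from the alignment table."""
    if survey_itemid not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column name: {survey_itemid}")
    cursor = connection.execute(f"SELECT {survey_itemid} FROM alignment")
    return [row[0] for row in cursor.fetchall()]


# In-memory stand-in for the real database (table layout is an assumption).
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE alignment (source_survey_itemid TEXT, target_survey_itemid TEXT)"
)
connection.executemany(
    "INSERT INTO alignment VALUES (?, ?)",
    [("src_1", "tgt_1"), ("src_2", "tgt_2")],
)

target_ids = get_ids_sketch(connection, "target_survey_itemid")
print(target_ids)
```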
-
retrieve_from_tables.
get_ids_from_alignment_table_per_language
(language) Gets all target IDs from the alignment table based on the language.
- Parameters
language (param1) – target language.
- Returns
A list of all target_survey_itemids in the alignment table.
-
retrieve_from_tables.
get_instruction_id
(text) Gets an instruction segment ID based on its text.
- Parameters
text (param1) – the instruction segment text.
- Returns
instruction segment ID (int).
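A text-to-ID lookup of this kind might look like the sketch below. It assumes an `instruction` table with hypothetical `instructionid` and `text` columns, uses SQLite as a stand-in for the real database, and returns `None` when no segment matches. How the actual function behaves for missing text is not stated in this reference.

```python
import sqlite3


def get_instruction_id_sketch(connection, text):
    """Return the ID of the instruction whose text matches, or None if absent."""
    row = connection.execute(
        "SELECT instructionid FROM instruction WHERE text = ?", (text,)
    ).fetchone()
    return row[0] if row else None


# In-memory stand-in for the real database (column names are assumptions).
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE instruction (instructionid INTEGER PRIMARY KEY, text TEXT)"
)
connection.execute(
    "INSERT INTO instruction (text) VALUES (?)", ("Please choose one answer.",)
)

instruction_id = get_instruction_id_sketch(connection, "Please choose one answer.")
print(instruction_id)
```

The introduction, request, and module lookups documented below presumably follow the same pattern against their own tables.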
-
retrieve_from_tables.
get_introduction_id
(text) Gets an introduction segment ID based on its text.
- Parameters
text (param1) – the introduction segment text.
- Returns
introduction segment ID (int).
-
retrieve_from_tables.
get_module_id
(module_name) Gets a module ID based on its name.
- Parameters
module_name (param1) – the name of the module.
- Returns
module ID (int).
-
retrieve_from_tables.
get_request_id
(text) Gets a request segment ID based on its text.
- Parameters
text (param1) – the request segment text.
- Returns
request segment ID (int).
-
retrieve_from_tables.
get_response_id
(text, item_value) Gets a response segment ID based on its text and item value.
- Parameters
text (param1) – the response segment text.
- Returns
response segment ID (int).
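Unlike the other lookups, this function takes two arguments. A plausible reading is that the same response text can occur with different item values, so both are needed to identify a row. The sketch below illustrates that reading only. The `response` table layout, the `responseid` and `item_value` column names, and the `None` result for a missing row are all assumptions.

```python
import sqlite3


def get_response_id_sketch(connection, text, item_value):
    """Return the ID of the response matching both text and item value, or None."""
    row = connection.execute(
        "SELECT responseid FROM response WHERE text = ? AND item_value = ?",
        (text, item_value),
    ).fetchone()
    return row[0] if row else None


# In-memory stand-in for the real database (column names are assumptions).
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE response (responseid INTEGER PRIMARY KEY, text TEXT, item_value TEXT)"
)
connection.executemany(
    "INSERT INTO response (text, item_value) VALUES (?, ?)",
    [("Yes", "1"), ("Yes", "2")],
)

response_id = get_response_id_sketch(connection, "Yes", "2")
print(response_id)
```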
-
retrieve_from_tables.
get_tagged_text_from_survey_item_table
() Gets the survey_itemid and the POS tagged text from the survey_item table and creates a dictionary.
- Returns
A dictionary with survey_itemids as keys and POS tagged text as values.
Python3 script for ESS dataset inclusion in the MCSQ database.
Before running the script, install the requirements: pandas, numpy, SQLAlchemy, psycopg2.
Author: Danielly Sorato
Author contact: danielly.sorato@gmail.com
-
populate_tables.
tag_alignment_table
(dictionary, id_list, column_name, source_or_target_id) Inserts the POS alignment annotation on either the target or the source text column.
- Parameters
dictionary (param1) – a dictionary where the keys are the survey_itemids and the values are the POS tagged text segments.
id_list (param2) – list of the IDs that refer to the text to be annotated.
column_name (param3) – defines whether the column to be tagged is the source or the target.
source_or_target_id (param4) – name of the ID (either target_survey_itemid or source_survey_itemid).
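The interplay of the four parameters can be sketched as a single parameterized UPDATE, run once per ID. This is an illustration under stated assumptions, not the script's code: the `alignment` table name, the `source_pos_tagged_text` and `target_pos_tagged_text` column names, the decision to skip IDs missing from the dictionary, and the use of SQLite are all hypothetical.

```python
import sqlite3

# Whitelists for the identifiers interpolated into the SQL text.
# The POS column names are hypothetical; the ID column names come from this reference.
POS_COLUMNS = {"source_pos_tagged_text", "target_pos_tagged_text"}
ID_COLUMNS = {"source_survey_itemid", "target_survey_itemid"}


def tag_alignment_table_sketch(connection, dictionary, id_list, column_name, source_or_target_id):
    """Write the POS tagged text of each listed ID into the chosen column."""
    if column_name not in POS_COLUMNS or source_or_target_id not in ID_COLUMNS:
        raise ValueError("unexpected column name")
    statement = f"UPDATE alignment SET {column_name} = ? WHERE {source_or_target_id} = ?"
    # Skip IDs that have no tagged text in the dictionary (an assumption).
    parameters = [(dictionary[item_id], item_id) for item_id in id_list if item_id in dictionary]
    connection.executemany(statement, parameters)
    connection.commit()


# In-memory stand-in for the real database (table layout is an assumption).
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE alignment ("
    "source_survey_itemid TEXT, target_survey_itemid TEXT, "
    "source_pos_tagged_text TEXT, target_pos_tagged_text TEXT)"
)
connection.executemany(
    "INSERT INTO alignment (source_survey_itemid, target_survey_itemid) VALUES (?, ?)",
    [("src_1", "tgt_1"), ("src_2", "tgt_2")],
)

tagged_targets = {"tgt_1": "Sehr<ADV> gut<ADJ>"}
tag_alignment_table_sketch(
    connection, tagged_targets, ["tgt_1", "tgt_2"],
    "target_pos_tagged_text", "target_survey_itemid",
)
rows = connection.execute(
    "SELECT target_survey_itemid, target_pos_tagged_text FROM alignment "
    "ORDER BY target_survey_itemid"
).fetchall()
print(rows)
```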
-
populate_tables.
tag_item_type_table
(dictionary, table_name, table_id_name) Inserts the POS alignment annotation in an item type specific table.
- Parameters
dictionary (param1) – a dictionary where the keys are the survey_itemids and the values are the POS tagged text segments.
table_name (param2) – name of the table to be tagged (introduction, instruction, request or response).
table_id_name (param3) – name of the ID of the table.
-
populate_tables.
tag_survey_item
(dictionary, table_id_name) Inserts the POS alignment annotation in the survey_item table.
- Parameters
dictionary (param1) – an item type specific dictionary where the keys are the IDs and the values are the POS tagged text segments.
table_id_name (param2) – name of the ID of the item type specific table.