~/Desktop/work/doidata noref/add-normalization-export-scripts-notices 30s
doidata-NIroBdgM-py3.10 ❯ PYTHONPATH=. python ./doidata/research/analyze_normalized_notices_export.py
(558769, 7)
                                     doi  ...                         combined_doi_notice_status
0                       10.1038/148226a0  ...                    10.1038/148226a0::::Has erratum
1                       10.1038/156626b0  ...                    10.1038/156626b0::::Has erratum
2            10.1016/j.aller.2014.11.002  ...         10.1016/j.aller.2014.11.002::::Has erratum
3  10.17504/protocols.io.eq2lyn2kwvx9/v1  ...  10.17504/protocols.io.eq2lyn2kwvx9/v1::::Retra...
4           10.1371/journal.pcbi.1010113  ...        10.1371/journal.pcbi.1010113::::New Version

[5 rows x 7 columns]

Total from scite:  159764
Total from pubmed:  180510
Total from crossref:  218495

Unique notices in withdrawn/retracted DF:  ['Retracted' 'Withdrawn']
Unique notices in 'other notices' DF:  ['Has erratum' 'New Version' 'Has correction' 'Has expression of concern'
 'Comment' 'New Edition' 'Clarification' 'Addendum' 'Article'
 'Note Discuss' 'Publisher Note' 'Contributed Paper' 'Unknown'
 'Invited Paper' 'Peer Confarticles' 'Tg Report' 'Point Counterpoint'
 'Sample Update' 'This Is Some Update 23' 'Invited Article' 'Tutorial'
 'Interesting Update' 'Print' 'Correspondence' 'Book Review'
 'Communications']

notices in all sources:  {'total_in_all_sources': 7801, 'num_other_notices_all_sources': 4556, 'num_withdrawn_retracted_notices_all_sources': 3245}

crossref intersect pubmed but not scite:  {'total_from_crossref_intersect_pubmed_not_scite': 17299, 'num_other_notices_crossref_intersect_pubmed_not_scite': 16210, 'num_withdrawn_retracted_notices_crossref_intersect_pubmed_not_scite': 1089}
crossref intersect scite not pubmed:  {'total_from_crossref_intersect_scite_not_pubmed': 55391, 'num_other_notices_crossref_intersect_scite_not_pubmed': 51867, 'num_withdrawn_retracted_notices_crossref_intersect_scite_not_pubmed': 3524}
pubmed intersect scite not crossref:  {'total_from_pubmed_intersect_scite_not_crossref': 8445, 'num_other_notices_pubmed_intersect_scite_not_crossref': 6147, 'num_withdrawn_retracted_notices_pubmed_intersect_scite_not_crossref': 2298}

scite only:  {'total_from_scite_only': 84415, 'num_other_notices_scite_only': 52112, 'num_withdrawn_retracted_notices_scite_only': 32303}
pubmed only:  {'total_from_pubmed_only': 146891, 'num_other_notices_pubmed_only': 142708, 'num_withdrawn_retracted_notices_pubmed_only': 4183}
crossref only:  {'total_from_crossref_only': 133575, 'num_other_notices_crossref_only': 130730, 'num_withdrawn_retracted_notices_crossref_only': 2845}

total unique across sources:  453817

*** -- Counts for Table -- **
Counts for CROSSREF
Addendum:  {'total': 1031, 'unique_to_CROSSREF': 1018}
** Writing results to file: doidata/research/tmp/unique_addendum_for_crossref.csv
Clarification:  {'total': 352, 'unique_to_CROSSREF': 352}
** Writing results to file: doidata/research/tmp/unique_clarif_for_crossref.csv
Comments:  {'total': 12, 'unique_to_CROSSREF': 12}
** Writing results to file: doidata/research/tmp/unique_comments_for_crossref.csv
Errata:  {'total': 53422, 'unique_to_CROSSREF': 24583}
** Writing results to file: doidata/research/tmp/unique_errata_for_crossref.csv
Corrections:  {'total': 117529, 'unique_to_CROSSREF': 70421}
** Writing results to file: doidata/research/tmp/unique_corr_for_crossref.csv
Expressions of Concern:  {'total': 901, 'unique_to_CROSSREF': 560}
** Writing results to file: doidata/research/tmp/unique_eoc_for_crossref.csv
Retractions:  {'total': 9125, 'unique_to_CROSSREF': 1009}
** Writing results to file: doidata/research/tmp/unique_retr_for_crossref.csv
Withdrawals:  {'total': 2290, 'unique_to_CROSSREF': 1836}
** Writing results to file: doidata/research/tmp/unique_withdrawals_for_crossref.csv
** Done **

Counts for PUBMED
Addendum:  {'total': 0, 'unique_to_PUBMED': 0}
** Writing results to file: doidata/research/tmp/unique_addendum_for_pubmed.csv
Clarification:  {'total': 0, 'unique_to_PUBMED': 0}
** Writing results to file: doidata/research/tmp/unique_clarif_for_pubmed.csv
Comments:  {'total': 33027, 'unique_to_PUBMED': 33027}
** Writing results to file: doidata/research/tmp/unique_comments_for_pubmed.csv
Errata:  {'total': 135185, 'unique_to_PUBMED': 108608}
** Writing results to file: doidata/research/tmp/unique_errata_for_pubmed.csv
Corrections:  {'total': 0, 'unique_to_PUBMED': 0}
** Writing results to file: doidata/research/tmp/unique_corr_for_pubmed.csv
Expressions of Concern:  {'total': 1409, 'unique_to_PUBMED': 1073}
** Writing results to file: doidata/research/tmp/unique_eoc_for_pubmed.csv
Retractions:  {'total': 10889, 'unique_to_PUBMED': 4183}
** Writing results to file: doidata/research/tmp/unique_retr_for_pubmed.csv
Withdrawals:  {'total': 0, 'unique_to_PUBMED': 0}
** Writing results to file: doidata/research/tmp/unique_withdrawals_for_pubmed.csv
** Done **

Counts for SCITE
Addendum:  {'total': 0, 'unique_to_SCITE': 0}
** Writing results to file: doidata/research/tmp/unique_addendum_for_scite.csv
Clarification:  {'total': 0, 'unique_to_SCITE': 0}
** Writing results to file: doidata/research/tmp/unique_clarif_for_scite.csv
Comments:  {'total': 0, 'unique_to_SCITE': 0}
** Writing results to file: doidata/research/tmp/unique_comments_for_scite.csv
Errata:  {'total': 62134, 'unique_to_SCITE': 43248}
** Writing results to file: doidata/research/tmp/unique_errata_for_scite.csv
Corrections:  {'total': 52549, 'unique_to_SCITE': 8864}
** Writing results to file: doidata/research/tmp/unique_corr_for_scite.csv
Expressions of Concern:  {'total': 0, 'unique_to_SCITE': 0}
** Writing results to file: doidata/research/tmp/unique_eoc_for_scite.csv
Retractions:  {'total': 34801, 'unique_to_SCITE': 22701}
** Writing results to file: doidata/research/tmp/unique_retr_for_scite.csv
Withdrawals:  {'total': 10280, 'unique_to_SCITE': 9602}
** Writing results to file: doidata/research/tmp/unique_withdrawals_for_scite.csv
** Done **
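
The overlap figures in the output above (all sources, the three pairwise intersections, and the three single-source counts) describe seven disjoint regions, so their totals should add up to the "total unique across sources" figure. A minimal sketch of that bookkeeping follows, assuming each source can be reduced to a plain set of DOI strings. The function name `partition_by_source`, the region keys, and the toy DOIs are hypothetical and are not taken from `analyze_normalized_notices_export.py`, which may compute these counts differently.

```python
from typing import Dict, Set


def partition_by_source(
    crossref: Set[str], pubmed: Set[str], scite: Set[str]
) -> Dict[str, Set[str]]:
    """Split DOIs from three sources into the seven disjoint Venn regions."""
    return {
        "all_sources": crossref & pubmed & scite,
        "crossref_pubmed_not_scite": (crossref & pubmed) - scite,
        "crossref_scite_not_pubmed": (crossref & scite) - pubmed,
        "pubmed_scite_not_crossref": (pubmed & scite) - crossref,
        "crossref_only": crossref - pubmed - scite,
        "pubmed_only": pubmed - crossref - scite,
        "scite_only": scite - crossref - pubmed,
    }


# Toy DOIs (hypothetical), one per region, to exercise the partition.
crossref = {"10.1/a", "10.1/b", "10.1/c", "10.1/d"}
pubmed = {"10.1/a", "10.1/b", "10.1/e", "10.1/f"}
scite = {"10.1/a", "10.1/c", "10.1/e", "10.1/g"}

regions = partition_by_source(crossref, pubmed, scite)
region_sizes = {name: len(dois) for name, dois in regions.items()}
total_unique = len(crossref | pubmed | scite)

# The regions are disjoint and cover the union, so their sizes sum to it.
assert sum(region_sizes.values()) == total_unique

# The same check applied to the region totals printed in the log above.
logged_region_totals = [7801, 17299, 55391, 8445, 84415, 146891, 133575]
logged_total_unique = 453817
assert sum(logged_region_totals) == logged_total_unique
```

The seven logged region totals do sum to 453817. The per-source totals printed near the top are slightly larger than the sum of each source's four regions (for scite, 156052 against 159764), which may mean those totals count rows rather than unique DOIs; that reading is an assumption, not something the output confirms.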