Overview

Dataset statistics

Number of variables: 13
Number of observations: 31717
Missing cells: 221917
Missing cells (%): 53.8%
Total size in memory: 4.7 MiB
Average record size in memory: 155.0 B
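
The missing-cell percentage follows from the counts above, since the table holds 13 × 31717 cells in total. A minimal arithmetic check, with illustrative variable names that are not part of the report:

```python
# Counts copied from the dataset statistics above.
n_variables = 13
n_observations = 31717
missing_cells = 221917

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_share = missing_cells / total_cells

print(total_cells)             # 412321
print(f"{missing_share:.1%}")  # 53.8%
```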

Variable types

Numeric: 8
Unsupported: 4
Categorical: 1

Alerts

[Constant] country_cd has constant value "BE"
[Missing] socecon_lvl_cd has 31717 (100.0%) missing values
[Missing] country_origin_cd has 31717 (100.0%) missing values
[Missing] hospital_id has 31717 (100.0%) missing values
[Missing] time_dx_to_surgery_nm has 6506 (20.5%) missing values
[Missing] time_dx_to_radiotherapy_nm has 31537 (99.4%) missing values
[Missing] time_dx_to_chemotherapy_nm has 26193 (82.6%) missing values
[Missing] time_dx_to_immunotherapy_nm has 31717 (100.0%) missing values
[Missing] time_dx_to_hormonotherapy_nm has 30813 (97.1%) missing values
[Unsupported] socecon_lvl_cd is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] country_origin_cd is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] hospital_id is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] time_dx_to_immunotherapy_nm is an unsupported type, check if it needs cleaning or further analysis
[Zeros] time_dx_to_surgery_nm has 745 (2.3%) zeros
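
The missing-value alerts can be cross-checked against the overview: the per-column missing counts should add up to the 221917 missing cells reported there. The sketch below does that and also lists the columns whose missing share exceeds a cutoff. The 95% cutoff is an arbitrary value chosen for illustration, not a setting taken from pandas-profiling.

```python
N_ROWS = 31717

# Missing counts per column, copied from the alerts above.
missing = {
    "socecon_lvl_cd": 31717,
    "country_origin_cd": 31717,
    "hospital_id": 31717,
    "time_dx_to_surgery_nm": 6506,
    "time_dx_to_radiotherapy_nm": 31537,
    "time_dx_to_chemotherapy_nm": 26193,
    "time_dx_to_immunotherapy_nm": 31717,
    "time_dx_to_hormonotherapy_nm": 30813,
}

total_missing = sum(missing.values())

# Hypothetical cutoff for "nearly empty" columns (an assumption for illustration).
CUTOFF = 0.95
nearly_empty = sorted(col for col, n in missing.items() if n / N_ROWS > CUTOFF)

print(total_missing)
for col in nearly_empty:
    print(f"{col}: {missing[col] / N_ROWS:.1%}")
```

Under that cutoff only the surgery and chemotherapy timings keep enough values to be summarised meaningfully.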

Reproduction

Analysis started: 2022-05-04 13:14:07.260889
Analysis finished: 2022-05-04 13:14:07.635561
Duration: 0.37 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

patient_id
Real number (ℝ≥0)

Distinct: 30809
Distinct (%): 97.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49999930.15
Minimum: 10005351
Maximum: 89998320
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 247.9 KiB

Quantile statistics

Minimum: 10005351
5-th percentile: 14000708
Q1: 29703003
median: 50009072
Q3: 70080011
95-th percentile: 86073633
Maximum: 89998320
Range: 79992969
Interquartile range (IQR): 40377008

Descriptive statistics

Standard deviation: 23231198.84
Coefficient of variation (CV): 0.4646246258
Kurtosis: -1.215843051
Mean: 49999930.15
Median Absolute Deviation (MAD): 20180810
Skewness: 0.001782261476
Sum: 1.585847785 × 10^12
Variance: 5.396885994 × 10^14
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values
Value | Count | Frequency (%)
84188296 | 3 | < 0.1%
23360325 | 2 | < 0.1%
63258940 | 2 | < 0.1%
88028279 | 2 | < 0.1%
89497699 | 2 | < 0.1%
67256278 | 2 | < 0.1%
13054220 | 2 | < 0.1%
12187467 | 2 | < 0.1%
16483323 | 2 | < 0.1%
14612835 | 2 | < 0.1%
Other values (30799) | 31696 | 99.9%

Minimum values
Value | Count | Frequency (%)
10005351 | 1 | < 0.1%
10007015 | 1 | < 0.1%
10013391 | 1 | < 0.1%
10017778 | 1 | < 0.1%
10020800 | 1 | < 0.1%
10020816 | 1 | < 0.1%
10021007 | 1 | < 0.1%
10021080 | 1 | < 0.1%
10023156 | 1 | < 0.1%
10025366 | 1 | < 0.1%

Maximum values
Value | Count | Frequency (%)
89998320 | 1 | < 0.1%
89991874 | 1 | < 0.1%
89990868 | 1 | < 0.1%
89989668 | 1 | < 0.1%
89989188 | 1 | < 0.1%
89987323 | 1 | < 0.1%
89986571 | 1 | < 0.1%
89986235 | 1 | < 0.1%
89980664 | 1 | < 0.1%
89980579 | 1 | < 0.1%
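
The patient_id statistics above look like those of identifiers spread roughly evenly over their range, which is what one would expect from an arbitrary key rather than a measurement. A rough way to check is to compare the reported standard deviation and kurtosis with the textbook values for a continuous uniform distribution on the same interval. This sketch assumes the report's kurtosis is excess kurtosis (the pandas convention); the comparison is suggestive only and proves nothing about how the IDs were generated.

```python
import math

# Values copied from the patient_id summary above.
minimum, maximum = 10005351, 89998320
reported_std = 23231198.84
reported_kurtosis = -1.215843051

# Reference values for a continuous uniform distribution on [minimum, maximum].
uniform_std = (maximum - minimum) / math.sqrt(12)
uniform_excess_kurtosis = -6 / 5

print(f"std ratio (reported / uniform): {reported_std / uniform_std:.3f}")
print(f"kurtosis gap: {reported_kurtosis - uniform_excess_kurtosis:+.3f}")
```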

age_nm
Real number (ℝ≥0)

Distinct: 58
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 61.97856039
Minimum: 21
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 247.9 KiB

Quantile statistics

Minimum: 21
5-th percentile: 42
Q1: 54
median: 63
Q3: 71
95-th percentile: 78
Maximum: 80
Range: 59
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.14975541
Coefficient of variation (CV): 0.1798969731
Kurtosis: -0.4824289114
Mean: 61.97856039
Median Absolute Deviation (MAD): 9
Skewness: -0.4307372338
Sum: 1965774
Variance: 124.3170457
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values
Value | Count | Frequency (%)
67 | 1073 | 3.4%
69 | 1054 | 3.3%
72 | 1021 | 3.2%
66 | 1011 | 3.2%
71 | 1008 | 3.2%
68 | 992 | 3.1%
65 | 983 | 3.1%
70 | 978 | 3.1%
63 | 973 | 3.1%
64 | 946 | 3.0%
Other values (48) | 21678 | 68.3%

Minimum values
Value | Count | Frequency (%)
21 | 1 | < 0.1%
23 | 2 | < 0.1%
24 | 5 | < 0.1%
26 | 3 | < 0.1%
27 | 14 | < 0.1%
28 | 14 | < 0.1%
29 | 27 | 0.1%
30 | 44 | 0.1%
31 | 40 | 0.1%
32 | 50 | 0.2%

Maximum values
Value | Count | Frequency (%)
80 | 549 | 1.7%
79 | 701 | 2.2%
78 | 764 | 2.4%
77 | 813 | 2.6%
76 | 853 | 2.7%
75 | 920 | 2.9%
74 | 899 | 2.8%
73 | 930 | 2.9%
72 | 1021 | 3.2%
71 | 1008 | 3.2%
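
Several of the age_nm figures above are derived from one another, which gives a quick consistency check on the report. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the reported sum, quantiles, and standard deviation. Variable names are illustrative.

```python
import math

# Values copied from the age_nm summary above.
n = 31717
total = 1965774
minimum, q1, q3, maximum = 21, 54, 71, 80
std = 11.14975541

mean = total / n               # Mean = Sum / number of observations
value_range = maximum - minimum
iqr = q3 - q1                  # Interquartile range = Q3 - Q1
cv = std / mean                # Coefficient of variation = std / mean
variance = std ** 2

print(round(mean, 4), value_range, iqr, round(cv, 4), round(variance, 3))
```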

socecon_lvl_cd
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 31717
Missing (%): 100.0%
Memory size: 247.9 KiB

country_cd
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB
BE: 31717

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: BE
2nd row: BE
3rd row: BE
4th row: BE
5th row: BE

Common Values

Value | Count | Frequency (%)
BE | 31717 | 100.0%

Value | Count | Frequency (%)
be | 31717 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
No values found.

Most occurring categories

Value | Count | Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value | Count | Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value | Count | Frequency (%)
No values found.

Most frequent character per block

country_origin_cd
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 31717
Missing (%): 100.0%
Memory size: 247.9 KiB

hospital_id
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 31717
Missing (%): 100.0%
Memory size: 247.9 KiB

ttm_type_cd
Real number (ℝ≥0)

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.461456002
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 247.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 1
95-th percentile: 3
Maximum: 5
Range: 4
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.9626675331
Coefficient of variation (CV): 0.6587044236
Kurtosis: 3.230282628
Mean: 1.461456002
Median Absolute Deviation (MAD): 0
Skewness: 1.995395754
Sum: 46353
Variance: 0.9267287792
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=4)

Common Values
Value | Count | Frequency (%)
1 | 25199 | 79.4%
3 | 5481 | 17.3%
5 | 879 | 2.8%
2 | 158 | 0.5%

Minimum values
Value | Count | Frequency (%)
1 | 25199 | 79.4%
2 | 158 | 0.5%
3 | 5481 | 17.3%
5 | 879 | 2.8%

Maximum values
Value | Count | Frequency (%)
5 | 879 | 2.8%
3 | 5481 | 17.3%
2 | 158 | 0.5%
1 | 25199 | 79.4%
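
Because ttm_type_cd takes only four values, its summary statistics can be recomputed exactly from the frequency table above. The sketch below rebuilds the sum, mean, and variance from the value counts. It assumes the report uses the sample variance (n - 1 in the denominator), which is the pandas default.

```python
# Value -> count, copied from the ttm_type_cd frequency table above.
counts = {1: 25199, 2: 158, 3: 5481, 5: 879}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance (n - 1 in the denominator), assumed to match the pandas default.
sum_sq_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = sum_sq_dev / (n - 1)

print(n, total, round(mean, 6), round(variance, 6))
```

The results agree with the reported Sum (46353), Mean (1.461456002), and Variance (0.9267287792).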

time_dx_to_surgery_nm
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 173
Distinct (%): 0.7%
Missing: 6506
Missing (%): 20.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.10495419
Minimum: 0
Maximum: 272
Zeros: 745
Zeros (%): 2.3%
Negative: 0
Negative (%): 0.0%
Memory size: 247.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7
Q1: 15
median: 23
Q3: 33
95-th percentile: 54
Maximum: 272
Range: 272
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 17.09837172
Coefficient of variation (CV): 0.654985701
Kurtosis: 21.31600807
Mean: 26.10495419
Median Absolute Deviation (MAD): 8
Skewness: 3.001671907
Sum: 658132
Variance: 292.3543154
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values
Value | Count | Frequency (%)
21 | 1180 | 3.7%
14 | 1140 | 3.6%
15 | 1135 | 3.6%
20 | 1011 | 3.2%
22 | 1007 | 3.2%
28 | 887 | 2.8%
19 | 868 | 2.7%
13 | 773 | 2.4%
16 | 771 | 2.4%
27 | 769 | 2.4%
Other values (163) | 15670 | 49.4%
(Missing) | 6506 | 20.5%

Minimum values
Value | Count | Frequency (%)
0 | 745 | 2.3%
1 | 40 | 0.1%
2 | 22 | 0.1%
3 | 26 | 0.1%
4 | 49 | 0.2%
5 | 43 | 0.1%
6 | 115 | 0.4%
7 | 250 | 0.8%
8 | 410 | 1.3%
9 | 386 | 1.2%

Maximum values
Value | Count | Frequency (%)
272 | 1 | < 0.1%
262 | 1 | < 0.1%
238 | 1 | < 0.1%
236 | 1 | < 0.1%
224 | 1 | < 0.1%
222 | 1 | < 0.1%
221 | 1 | < 0.1%
215 | 1 | < 0.1%
214 | 1 | < 0.1%
208 | 1 | < 0.1%

time_dx_to_radiotherapy_nm
Real number (ℝ≥0)

MISSING

Distinct: 52
Distinct (%): 28.9%
Missing: 31537
Missing (%): 99.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.14444444
Minimum: 0
Maximum: 78
Zeros: 14
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 247.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 6
median: 13.5
Q3: 27
95-th percentile: 50.05
Maximum: 78
Range: 78
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 15.88510501
Coefficient of variation (CV): 0.875480374
Kurtosis: 1.396895578
Mean: 18.14444444
Median Absolute Deviation (MAD): 9
Skewness: 1.258983381
Sum: 3266
Variance: 252.3365611
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values
Value | Count | Frequency (%)
0 | 14 | < 0.1%
11 | 11 | < 0.1%
6 | 8 | < 0.1%
4 | 7 | < 0.1%
8 | 7 | < 0.1%
13 | 7 | < 0.1%
5 | 7 | < 0.1%
24 | 6 | < 0.1%
19 | 6 | < 0.1%
14 | 6 | < 0.1%
Other values (42) | 101 | 0.3%
(Missing) | 31537 | 99.4%

Minimum values
Value | Count | Frequency (%)
0 | 14 | < 0.1%
1 | 4 | < 0.1%
2 | 3 | < 0.1%
3 | 5 | < 0.1%
4 | 7 | < 0.1%
5 | 7 | < 0.1%
6 | 8 | < 0.1%
7 | 5 | < 0.1%
8 | 7 | < 0.1%
9 | 4 | < 0.1%

Maximum values
Value | Count | Frequency (%)
78 | 1 | < 0.1%
67 | 1 | < 0.1%
65 | 1 | < 0.1%
63 | 1 | < 0.1%
62 | 1 | < 0.1%
57 | 1 | < 0.1%
56 | 1 | < 0.1%
54 | 1 | < 0.1%
51 | 1 | < 0.1%
50 | 2 | < 0.1%
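
With 99.4% of values missing, it helps to see how few observations stand behind the time_dx_to_radiotherapy_nm statistics. The arithmetic below indicates that the Mean and Distinct (%) figures are computed over the non-missing rows only, while Missing (%) and Zeros (%) are relative to all rows. Variable names are illustrative.

```python
# Values copied from the time_dx_to_radiotherapy_nm summary above.
n_rows = 31717
n_missing = 31537
n_distinct = 52
total = 3266

# Only the non-missing rows contribute to the mean and the distinct share.
n_present = n_rows - n_missing
mean = total / n_present
distinct_share = n_distinct / n_present

print(n_present)                # 180
print(round(mean, 2))           # 18.14
print(f"{distinct_share:.1%}")  # 28.9%
```

The quantiles, skewness, and kurtosis for this variable therefore rest on 180 records and should be read with corresponding caution.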

time_dx_to_chemotherapy_nm
Real number (ℝ≥0)

MISSING

Distinct: 159
Distinct (%): 2.9%
Missing: 26193
Missing (%): 82.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.5923244
Minimum: 0
Maximum: 266
Zeros: 121
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 247.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7
Q1: 17
median: 24
Q3: 32
95-th percentile: 53
Maximum: 266
Range: 266
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 22.99481522
Coefficient of variation (CV): 0.8333772425
Kurtosis: 32.63754422
Mean: 27.5923244
Median Absolute Deviation (MAD): 8
Skewness: 4.844563596
Sum: 152420
Variance: 528.7615272
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values
Value | Count | Frequency (%)
21 | 280 | 0.9%
28 | 242 | 0.8%
22 | 220 | 0.7%
14 | 217 | 0.7%
20 | 208 | 0.7%
27 | 207 | 0.7%
29 | 194 | 0.6%
15 | 190 | 0.6%
18 | 179 | 0.6%
23 | 178 | 0.6%
Other values (149) | 3409 | 10.7%
(Missing) | 26193 | 82.6%

Minimum values
Value | Count | Frequency (%)
0 | 121 | 0.4%
1 | 22 | 0.1%
2 | 11 | < 0.1%
3 | 18 | 0.1%
4 | 13 | < 0.1%
5 | 23 | 0.1%
6 | 44 | 0.1%
7 | 56 | 0.2%
8 | 52 | 0.2%
9 | 65 | 0.2%

Maximum values
Value | Count | Frequency (%)
266 | 1 | < 0.1%
245 | 1 | < 0.1%
242 | 1 | < 0.1%
230 | 1 | < 0.1%
228 | 1 | < 0.1%
227 | 1 | < 0.1%
225 | 1 | < 0.1%
224 | 1 | < 0.1%
223 | 1 | < 0.1%
214 | 1 | < 0.1%

time_dx_to_immunotherapy_nm
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 31717
Missing (%): 100.0%
Memory size: 247.9 KiB

time_dx_to_hormonotherapy_nm
Real number (ℝ≥0)

MISSING

Distinct: 95
Distinct (%): 10.5%
Missing: 30813
Missing (%): 97.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.33185841
Minimum: 0
Maximum: 205
Zeros: 53
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 247.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 9
median: 17
Q3: 30
95-th percentile: 60
Maximum: 205
Range: 205
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 23.3409047
Coefficient of variation (CV): 1.000387723
Kurtosis: 15.1769184
Mean: 23.33185841
Median Absolute Deviation (MAD): 10
Skewness: 3.118101876
Sum: 21092
Variance: 544.7978322
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values
Value | Count | Frequency (%)
0 | 53 | 0.2%
7 | 38 | 0.1%
13 | 37 | 0.1%
15 | 35 | 0.1%
14 | 35 | 0.1%
10 | 33 | 0.1%
6 | 33 | 0.1%
21 | 30 | 0.1%
20 | 28 | 0.1%
17 | 26 | 0.1%
Other values (85) | 556 | 1.8%
(Missing) | 30813 | 97.1%

Minimum values
Value | Count | Frequency (%)
0 | 53 | 0.2%
1 | 12 | < 0.1%
2 | 6 | < 0.1%
3 | 11 | < 0.1%
4 | 15 | < 0.1%
5 | 13 | < 0.1%
6 | 33 | 0.1%
7 | 38 | 0.1%
8 | 26 | 0.1%
9 | 25 | 0.1%

Maximum values
Value | Count | Frequency (%)
205 | 1 | < 0.1%
193 | 1 | < 0.1%
183 | 1 | < 0.1%
178 | 1 | < 0.1%
152 | 1 | < 0.1%
141 | 1 | < 0.1%
134 | 1 | < 0.1%
133 | 1 | < 0.1%
126 | 1 | < 0.1%
121 | 2 | < 0.1%

period
Real number (ℝ≥0)

Distinct: 48
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.48380994
Minimum: 1
Maximum: 48
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 247.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 14
median: 26
Q3: 37
95-th percentile: 47
Maximum: 48
Range: 47
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 13.65553626
Coefficient of variation (CV): 0.5358514402
Kurtosis: -1.170318625
Mean: 25.48380994
Median Absolute Deviation (MAD): 12
Skewness: -0.04868959517
Sum: 808270
Variance: 186.4736706
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=48)

Common Values
Value | Count | Frequency (%)
39 | 835 | 2.6%
47 | 816 | 2.6%
45 | 802 | 2.5%
48 | 799 | 2.5%
46 | 795 | 2.5%
34 | 780 | 2.5%
31 | 779 | 2.5%
25 | 762 | 2.4%
36 | 749 | 2.4%
27 | 742 | 2.3%
Other values (38) | 23858 | 75.2%

Minimum values
Value | Count | Frequency (%)
1 | 198 | 0.6%
2 | 563 | 1.8%
3 | 740 | 2.3%
4 | 557 | 1.8%
5 | 651 | 2.1%
6 | 704 | 2.2%
7 | 579 | 1.8%
8 | 627 | 2.0%
9 | 551 | 1.7%
10 | 638 | 2.0%

Maximum values
Value | Count | Frequency (%)
48 | 799 | 2.5%
47 | 816 | 2.6%
46 | 795 | 2.5%
45 | 802 | 2.5%
44 | 653 | 2.1%
43 | 733 | 2.3%
42 | 576 | 1.8%
41 | 353 | 1.1%
40 | 552 | 1.7%
39 | 835 | 2.6%