Dataset statistics
Number of variables | 13 |
---|---|
Number of observations | 31717 |
Missing cells | 221917 |
Missing cells (%) | 53.8% |
Total size in memory | 4.7 MiB |
Average record size in memory | 155.0 B |
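The missing-cell figures above can be cross-checked against the per-variable missing counts that the report lists in its "Missing" alert lines. The following is a minimal sketch of that arithmetic; the names `n_vars`, `n_obs`, and `missing_by_column` are illustrative and are not produced by the profiling tool.

```python
# Cross-check of "Missing cells" and "Missing cells (%)" from figures in this report.
n_vars = 13
n_obs = 31717

# Per-variable missing counts, copied from the "Missing" alert lines.
missing_by_column = {
    "socecon_lvl_cd": 31717,
    "country_origin_cd": 31717,
    "hospital_id": 31717,
    "time_dx_to_surgery_nm": 6506,
    "time_dx_to_radiotherapy_nm": 31537,
    "time_dx_to_chemotherapy_nm": 26193,
    "time_dx_to_immunotherapy_nm": 31717,
    "time_dx_to_hormonotherapy_nm": 30813,
}

total_cells = n_vars * n_obs
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 221917
print(f"{missing_pct:.1f}%")  # 53.8%
```

Four entirely empty columns account for 126868 of the 221917 missing cells, which is why the overall percentage is so high.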
Variable types
Numeric | 8 |
---|---|
Unsupported | 4 |
Categorical | 1 |
Alerts
country_cd has constant value "BE" | Constant |
socecon_lvl_cd has 31717 (100.0%) missing values | Missing |
country_origin_cd has 31717 (100.0%) missing values | Missing |
hospital_id has 31717 (100.0%) missing values | Missing |
time_dx_to_surgery_nm has 6506 (20.5%) missing values | Missing |
time_dx_to_radiotherapy_nm has 31537 (99.4%) missing values | Missing |
time_dx_to_chemotherapy_nm has 26193 (82.6%) missing values | Missing |
time_dx_to_immunotherapy_nm has 31717 (100.0%) missing values | Missing |
time_dx_to_hormonotherapy_nm has 30813 (97.1%) missing values | Missing |
socecon_lvl_cd is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
country_origin_cd is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
hospital_id is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
time_dx_to_immunotherapy_nm is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
time_dx_to_surgery_nm has 745 (2.3%) zeros | Zeros |
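The four variables flagged as "Unsupported" are exactly the four that are 100% missing, which suggests (the report does not say so explicitly) that no type could be inferred because the columns are empty. A small sketch of that comparison, with the counts copied from the lines above and all variable names chosen for illustration:

```python
# Compare the "Unsupported" alerts with the columns that are entirely missing.
n_obs = 31717

# Missing counts from the "Missing" alert lines.
missing_by_column = {
    "socecon_lvl_cd": 31717,
    "country_origin_cd": 31717,
    "hospital_id": 31717,
    "time_dx_to_surgery_nm": 6506,
    "time_dx_to_radiotherapy_nm": 31537,
    "time_dx_to_chemotherapy_nm": 26193,
    "time_dx_to_immunotherapy_nm": 31717,
    "time_dx_to_hormonotherapy_nm": 30813,
}

# Variables named in the "Unsupported" alert lines.
unsupported = {
    "socecon_lvl_cd",
    "country_origin_cd",
    "hospital_id",
    "time_dx_to_immunotherapy_nm",
}

fully_missing = {name for name, count in missing_by_column.items() if count == n_obs}

print(fully_missing == unsupported)  # True
print(sorted(fully_missing))
```

Under that reading, the "cleaning" these alerts ask for may amount to dropping the empty columns or obtaining the data that should populate them.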
Reproduction
Analysis started | 2022-05-04 13:14:07.260889 |
---|---|
Analysis finished | 2022-05-04 13:14:07.635561 |
Duration | 0.37 seconds |
Software version | pandas-profiling v3.1.0 |
Download configuration | config.json |
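The reported duration follows directly from the two timestamps. A minimal sketch using only the standard library; the format string is an assumption based on how the timestamps are printed here:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-05-04 13:14:07.260889", fmt)
finished = datetime.strptime("2022-05-04 13:14:07.635561", fmt)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.37 seconds
```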
patient_id
Real number (ℝ≥0)
Distinct | 30809 |
---|---|
Distinct (%) | 97.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 49999930.15 |
Minimum | 10005351 |
---|---|
Maximum | 89998320 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 247.9 KiB |
Quantile statistics
Minimum | 10005351 |
---|---|
5-th percentile | 14000708 |
Q1 | 29703003 |
median | 50009072 |
Q3 | 70080011 |
95-th percentile | 86073633 |
Maximum | 89998320 |
Range | 79992969 |
Interquartile range (IQR) | 40377008 |
Descriptive statistics
Standard deviation | 23231198.84 |
---|---|
Coefficient of variation (CV) | 0.4646246258 |
Kurtosis | -1.215843051 |
Mean | 49999930.15 |
Median Absolute Deviation (MAD) | 20180810 |
Skewness | 0.001782261476 |
Sum | 1.585847785 × 10^12 |
Variance | 5.396885994 × 10^14 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
84188296 | 3 | < 0.1% |
23360325 | 2 | < 0.1% |
63258940 | 2 | < 0.1% |
88028279 | 2 | < 0.1% |
89497699 | 2 | < 0.1% |
67256278 | 2 | < 0.1% |
13054220 | 2 | < 0.1% |
12187467 | 2 | < 0.1% |
16483323 | 2 | < 0.1% |
14612835 | 2 | < 0.1% |
Other values (30799) | 31696 |
Value | Count | Frequency (%) |
10005351 | 1 | |
10007015 | 1 | |
10013391 | 1 | |
10017778 | 1 | |
10020800 | 1 | |
10020816 | 1 | |
10021007 | 1 | |
10021080 | 1 | |
10023156 | 1 | |
10025366 | 1 |
Value | Count | Frequency (%) |
89998320 | 1 | |
89991874 | 1 | |
89990868 | 1 | |
89989668 | 1 | |
89989188 | 1 | |
89987323 | 1 | |
89986571 | 1 | |
89986235 | 1 | |
89980664 | 1 | |
89980579 | 1 |
age_nm
Real number (ℝ≥0)
Distinct | 58 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 61.97856039 |
Minimum | 21 |
---|---|
Maximum | 80 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 247.9 KiB |
Quantile statistics
Minimum | 21 |
---|---|
5-th percentile | 42 |
Q1 | 54 |
median | 63 |
Q3 | 71 |
95-th percentile | 78 |
Maximum | 80 |
Range | 59 |
Interquartile range (IQR) | 17 |
Descriptive statistics
Standard deviation | 11.14975541 |
---|---|
Coefficient of variation (CV) | 0.1798969731 |
Kurtosis | -0.4824289114 |
Mean | 61.97856039 |
Median Absolute Deviation (MAD) | 9 |
Skewness | -0.4307372338 |
Sum | 1965774 |
Variance | 124.3170457 |
Monotonicity | Not monotonic |
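Several of the age_nm statistics are derived from others in the same tables, so they can be checked against each other. The sketch below recomputes them from the reported values; it assumes the mean is the sum divided by the number of observations, which holds here because age_nm has no missing values.

```python
# Recompute derived statistics for age_nm from values reported above.
n_obs = 31717
total = 1965774                       # "Sum"
minimum, q1, q3, maximum = 21, 54, 71, 80
std = 11.14975541                     # "Standard deviation"

mean = total / n_obs
value_range = maximum - minimum
iqr = q3 - q1
cv = std / mean                       # coefficient of variation
variance = std ** 2

print(round(mean, 4))      # 61.9786
print(value_range, iqr)    # 59 17
print(round(cv, 4))        # 0.1799
print(round(variance, 2))  # 124.32
```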
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
67 | 1073 | 3.4% |
69 | 1054 | 3.3% |
72 | 1021 | 3.2% |
66 | 1011 | 3.2% |
71 | 1008 | 3.2% |
68 | 992 | 3.1% |
65 | 983 | 3.1% |
70 | 978 | 3.1% |
63 | 973 | 3.1% |
64 | 946 | 3.0% |
Other values (48) | 21678 |
Value | Count | Frequency (%) |
21 | 1 | < 0.1% |
23 | 2 | < 0.1% |
24 | 5 | < 0.1% |
26 | 3 | < 0.1% |
27 | 14 | < 0.1% |
28 | 14 | < 0.1% |
29 | 27 | |
30 | 44 | |
31 | 40 | |
32 | 50 |
Value | Count | Frequency (%) |
80 | 549 | |
79 | 701 | |
78 | 764 | |
77 | 813 | |
76 | 853 | |
75 | 920 | |
74 | 899 | |
73 | 930 | |
72 | 1021 | |
71 | 1008 |
country_cd
Categorical
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.8 MiB |
BE |
---|
Characters and Unicode
Total characters | 0 |
---|---|
Distinct characters | 0 |
Distinct categories | 0 |
Distinct scripts | 0 |
Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | BE |
---|---|
2nd row | BE |
3rd row | BE |
4th row | BE |
5th row | BE |
Common Values
Value | Count | Frequency (%) |
BE | 31717 |
Value | Count | Frequency (%) |
be | 31717 |
Most occurring characters
Value | Count | Frequency (%) |
No values found. |
Most occurring categories
Value | Count | Frequency (%) |
No values found. |
Most occurring scripts
Value | Count | Frequency (%) |
No values found. |
Most occurring blocks
Value | Count | Frequency (%) |
No values found. |
ttm_type_cd
Real number (ℝ≥0)
Distinct | 4 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1.461456002 |
Minimum | 1 |
---|---|
Maximum | 5 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 247.9 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 1 |
median | 1 |
Q3 | 1 |
95-th percentile | 3 |
Maximum | 5 |
Range | 4 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 0.9626675331 |
---|---|
Coefficient of variation (CV) | 0.6587044236 |
Kurtosis | 3.230282628 |
Mean | 1.461456002 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 1.995395754 |
Sum | 46353 |
Variance | 0.9267287792 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=4)
Value | Count | Frequency (%) |
1 | 25199 | |
3 | 5481 | 17.3% |
5 | 879 | 2.8% |
2 | 158 | 0.5% |
Value | Count | Frequency (%) |
1 | 25199 | |
2 | 158 | 0.5% |
3 | 5481 | 17.3% |
5 | 879 | 2.8% |
Value | Count | Frequency (%) |
5 | 879 | 2.8% |
3 | 5481 | 17.3% |
2 | 158 | 0.5% |
1 | 25199 |
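Because ttm_type_cd takes only four values, its frequency percentages, sum, and mean can all be rebuilt from the counts in the tables above. A minimal sketch, with `counts` copied from the report and the other names chosen for illustration; it also yields the percentage for value 1, which is blank in the tables:

```python
# Rebuild the ttm_type_cd frequency table and mean from the reported counts.
counts = {1: 25199, 2: 158, 3: 5481, 5: 879}

n_obs = sum(counts.values())
percentages = {value: 100 * count / n_obs for value, count in counts.items()}

for value in sorted(counts):
    print(f"{value} | {counts[value]} | {percentages[value]:.1f}%")
# 1 | 25199 | 79.4%
# 2 | 158 | 0.5%
# 3 | 5481 | 17.3%
# 5 | 879 | 2.8%

weighted_sum = sum(value * count for value, count in counts.items())
mean = weighted_sum / n_obs
print(weighted_sum, round(mean, 6))  # 46353 1.461456
```

Since the variable is a code with four levels, the numeric summaries (mean, skewness, kurtosis) mostly reflect how dominant code 1 is; the frequency table is the more informative view.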
time_dx_to_surgery_nm
Real number (ℝ≥0)
Distinct | 173 |
---|---|
Distinct (%) | 0.7% |
Missing | 6506 |
Missing (%) | 20.5% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 26.10495419 |
Minimum | 0 |
---|---|
Maximum | 272 |
Zeros | 745 |
Zeros (%) | 2.3% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 247.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 7 |
Q1 | 15 |
median | 23 |
Q3 | 33 |
95-th percentile | 54 |
Maximum | 272 |
Range | 272 |
Interquartile range (IQR) | 18 |
Descriptive statistics
Standard deviation | 17.09837172 |
---|---|
Coefficient of variation (CV) | 0.654985701 |
Kurtosis | 21.31600807 |
Mean | 26.10495419 |
Median Absolute Deviation (MAD) | 8 |
Skewness | 3.001671907 |
Sum | 658132 |
Variance | 292.3543154 |
Monotonicity | Not monotonic |
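For a variable with missing values, the percentages in this block do not all share a denominator. The arithmetic below suggests that the mean and "Distinct (%)" are taken over non-missing observations, while "Missing (%)" and "Zeros (%)" are taken over all rows. This is inferred from the reported numbers, not from the tool's documentation, and the variable names are illustrative.

```python
# Check which denominator reproduces each reported figure for this variable.
n_obs = 31717
missing = 6506
distinct = 173
zeros = 745
total = 658132  # "Sum"

n_valid = n_obs - missing  # non-missing observations

mean = total / n_valid
distinct_pct = 100 * distinct / n_valid  # over non-missing rows
zeros_pct = 100 * zeros / n_obs          # over all rows
missing_pct = 100 * missing / n_obs      # over all rows

print(n_valid)                  # 25211
print(round(mean, 4))           # 26.105
print(f"{distinct_pct:.1f}%")   # 0.7%
print(f"{zeros_pct:.1f}%")      # 2.3%
print(f"{missing_pct:.1f}%")    # 20.5%
```

Using the other denominator would give 0.5% for distinct values and 3.0% for zeros, neither of which matches the report.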
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
21 | 1180 | 3.7% |
14 | 1140 | 3.6% |
15 | 1135 | 3.6% |
20 | 1011 | 3.2% |
22 | 1007 | 3.2% |
28 | 887 | 2.8% |
19 | 868 | 2.7% |
13 | 773 | 2.4% |
16 | 771 | 2.4% |
27 | 769 | 2.4% |
Other values (163) | 15670 | |
(Missing) | 6506 |
Value | Count | Frequency (%) |
0 | 745 | |
1 | 40 | 0.1% |
2 | 22 | 0.1% |
3 | 26 | 0.1% |
4 | 49 | 0.2% |
5 | 43 | 0.1% |
6 | 115 | 0.4% |
7 | 250 | 0.8% |
8 | 410 | |
9 | 386 |
Value | Count | Frequency (%) |
272 | 1 | |
262 | 1 | |
238 | 1 | |
236 | 1 | |
224 | 1 | |
222 | 1 | |
221 | 1 | |
215 | 1 | |
214 | 1 | |
208 | 1 |
time_dx_to_radiotherapy_nm
Real number (ℝ≥0)
Distinct | 52 |
---|---|
Distinct (%) | 28.9% |
Missing | 31537 |
Missing (%) | 99.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 18.14444444 |
Minimum | 0 |
---|---|
Maximum | 78 |
Zeros | 14 |
Zeros (%) | < 0.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 247.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 6 |
median | 13.5 |
Q3 | 27 |
95-th percentile | 50.05 |
Maximum | 78 |
Range | 78 |
Interquartile range (IQR) | 21 |
Descriptive statistics
Standard deviation | 15.88510501 |
---|---|
Coefficient of variation (CV) | 0.875480374 |
Kurtosis | 1.396895578 |
Mean | 18.14444444 |
Median Absolute Deviation (MAD) | 9 |
Skewness | 1.258983381 |
Sum | 3266 |
Variance | 252.3365611 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
0 | 14 | < 0.1% |
11 | 11 | < 0.1% |
6 | 8 | < 0.1% |
4 | 7 | < 0.1% |
8 | 7 | < 0.1% |
13 | 7 | < 0.1% |
5 | 7 | < 0.1% |
24 | 6 | < 0.1% |
19 | 6 | < 0.1% |
14 | 6 | < 0.1% |
Other values (42) | 101 | 0.3% |
(Missing) | 31537 |
Value | Count | Frequency (%) |
0 | 14 | |
1 | 4 | < 0.1% |
2 | 3 | < 0.1% |
3 | 5 | < 0.1% |
4 | 7 | |
5 | 7 | |
6 | 8 | |
7 | 5 | < 0.1% |
8 | 7 | |
9 | 4 | < 0.1% |
Value | Count | Frequency (%) |
78 | 1 | |
67 | 1 | |
65 | 1 | |
63 | 1 | |
62 | 1 | |
57 | 1 | |
56 | 1 | |
54 | 1 | |
51 | 1 | |
50 | 2 |
time_dx_to_chemotherapy_nm
Real number (ℝ≥0)
Distinct | 159 |
---|---|
Distinct (%) | 2.9% |
Missing | 26193 |
Missing (%) | 82.6% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 27.5923244 |
Minimum | 0 |
---|---|
Maximum | 266 |
Zeros | 121 |
Zeros (%) | 0.4% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 247.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 7 |
Q1 | 17 |
median | 24 |
Q3 | 32 |
95-th percentile | 53 |
Maximum | 266 |
Range | 266 |
Interquartile range (IQR) | 15 |
Descriptive statistics
Standard deviation | 22.99481522 |
---|---|
Coefficient of variation (CV) | 0.8333772425 |
Kurtosis | 32.63754422 |
Mean | 27.5923244 |
Median Absolute Deviation (MAD) | 8 |
Skewness | 4.844563596 |
Sum | 152420 |
Variance | 528.7615272 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
21 | 280 | 0.9% |
28 | 242 | 0.8% |
22 | 220 | 0.7% |
14 | 217 | 0.7% |
20 | 208 | 0.7% |
27 | 207 | 0.7% |
29 | 194 | 0.6% |
15 | 190 | 0.6% |
18 | 179 | 0.6% |
23 | 178 | 0.6% |
Other values (149) | 3409 | 10.7% |
(Missing) | 26193 |
Value | Count | Frequency (%) |
0 | 121 | |
1 | 22 | 0.1% |
2 | 11 | < 0.1% |
3 | 18 | 0.1% |
4 | 13 | < 0.1% |
5 | 23 | 0.1% |
6 | 44 | 0.1% |
7 | 56 | |
8 | 52 | |
9 | 65 |
Value | Count | Frequency (%) |
266 | 1 | |
245 | 1 | |
242 | 1 | |
230 | 1 | |
228 | 1 | |
227 | 1 | |
225 | 1 | |
224 | 1 | |
223 | 1 | |
214 | 1 |
time_dx_to_hormonotherapy_nm
Real number (ℝ≥0)
Distinct | 95 |
---|---|
Distinct (%) | 10.5% |
Missing | 30813 |
Missing (%) | 97.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 23.33185841 |
Minimum | 0 |
---|---|
Maximum | 205 |
Zeros | 53 |
Zeros (%) | 0.2% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 247.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 9 |
median | 17 |
Q3 | 30 |
95-th percentile | 60 |
Maximum | 205 |
Range | 205 |
Interquartile range (IQR) | 21 |
Descriptive statistics
Standard deviation | 23.3409047 |
---|---|
Coefficient of variation (CV) | 1.000387723 |
Kurtosis | 15.1769184 |
Mean | 23.33185841 |
Median Absolute Deviation (MAD) | 10 |
Skewness | 3.118101876 |
Sum | 21092 |
Variance | 544.7978322 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
0 | 53 | 0.2% |
7 | 38 | 0.1% |
13 | 37 | 0.1% |
15 | 35 | 0.1% |
14 | 35 | 0.1% |
10 | 33 | 0.1% |
6 | 33 | 0.1% |
21 | 30 | 0.1% |
20 | 28 | 0.1% |
17 | 26 | 0.1% |
Other values (85) | 556 | 1.8% |
(Missing) | 30813 |
Value | Count | Frequency (%) |
0 | 53 | |
1 | 12 | < 0.1% |
2 | 6 | < 0.1% |
3 | 11 | < 0.1% |
4 | 15 | < 0.1% |
5 | 13 | < 0.1% |
6 | 33 | |
7 | 38 | |
8 | 26 | |
9 | 25 |
Value | Count | Frequency (%) |
205 | 1 | |
193 | 1 | |
183 | 1 | |
178 | 1 | |
152 | 1 | |
141 | 1 | |
134 | 1 | |
133 | 1 | |
126 | 1 | |
121 | 2 |
period
Real number (ℝ≥0)
Distinct | 48 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 25.48380994 |
Minimum | 1 |
---|---|
Maximum | 48 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 247.9 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 4 |
Q1 | 14 |
median | 26 |
Q3 | 37 |
95-th percentile | 47 |
Maximum | 48 |
Range | 47 |
Interquartile range (IQR) | 23 |
Descriptive statistics
Standard deviation | 13.65553626 |
---|---|
Coefficient of variation (CV) | 0.5358514402 |
Kurtosis | -1.170318625 |
Mean | 25.48380994 |
Median Absolute Deviation (MAD) | 12 |
Skewness | -0.04868959517 |
Sum | 808270 |
Variance | 186.4736706 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=48)
Value | Count | Frequency (%) |
39 | 835 | 2.6% |
47 | 816 | 2.6% |
45 | 802 | 2.5% |
48 | 799 | 2.5% |
46 | 795 | 2.5% |
34 | 780 | 2.5% |
31 | 779 | 2.5% |
25 | 762 | 2.4% |
36 | 749 | 2.4% |
27 | 742 | 2.3% |
Other values (38) | 23858 |
Value | Count | Frequency (%) |
1 | 198 | 0.6% |
2 | 563 | |
3 | 740 | |
4 | 557 | |
5 | 651 | |
6 | 704 | |
7 | 579 | |
8 | 627 | |
9 | 551 | |
10 | 638 |
Value | Count | Frequency (%) |
48 | 799 | |
47 | 816 | |
46 | 795 | |
45 | 802 | |
44 | 653 | |
43 | 733 | |
42 | 576 | |
41 | 353 | |
40 | 552 | |
39 | 835 |