Dataset statistics
Number of variables | 13 |
---|---|
Number of observations | 2983 |
Missing cells | 15487 |
Missing cells (%) | 39.9% |
Total size in memory | 451.7 KiB |
Average record size in memory | 155.0 B |
Variable types
Numeric | 10 |
---|---|
Unsupported | 2 |
Categorical | 1 |
country_cd has constant value "LV" | Constant |
time_dx_to_immunotherapy_nm has constant value "1035.0" | Constant |
socecon_lvl_cd has 2983 (100.0%) missing values | Missing |
country_origin_cd has 2983 (100.0%) missing values | Missing |
time_dx_to_surgery_nm has 117 (3.9%) missing values | Missing |
time_dx_to_radiotherapy_nm has 1654 (55.4%) missing values | Missing |
time_dx_to_chemotherapy_nm has 2074 (69.5%) missing values | Missing |
time_dx_to_immunotherapy_nm has 2982 (> 99.9%) missing values | Missing |
time_dx_to_hormonotherapy_nm has 2694 (90.3%) missing values | Missing |
patient_id has unique values | Unique |
socecon_lvl_cd is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
country_origin_cd is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
time_dx_to_surgery_nm has 1167 (39.1%) zeros | Zeros |
time_dx_to_chemotherapy_nm has 76 (2.5%) zeros | Zeros |
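The per-variable missing counts in the alerts above can be reconciled with the "Missing cells" totals in the dataset statistics. The sketch below is a minimal arithmetic check, assuming that variables without a "Missing" alert have no missing values; the variable names in the dictionary are copied from the alerts, while the Python names are only illustrative.

```python
# Per-variable missing counts, copied from the "Missing" alerts above.
missing_by_variable = {
    "socecon_lvl_cd": 2983,
    "country_origin_cd": 2983,
    "time_dx_to_surgery_nm": 117,
    "time_dx_to_radiotherapy_nm": 1654,
    "time_dx_to_chemotherapy_nm": 2074,
    "time_dx_to_immunotherapy_nm": 2982,
    "time_dx_to_hormonotherapy_nm": 2694,
}
n_variables = 13        # "Number of variables"
n_observations = 2983   # "Number of observations"

# Assumption: variables not listed in the alerts contribute zero missing cells.
missing_cells = sum(missing_by_variable.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 15487
print(f"{missing_pct:.1f}%")  # 39.9%
```

Both results agree with the reported "Missing cells" and "Missing cells (%)" values, which suggests the seven alerted variables account for all missing data.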
Reproduction
Analysis started | 2022-06-08 06:41:45.164902 |
---|---|
Analysis finished | 2022-06-08 06:41:45.355088 |
Duration | 0.19 seconds |
Software version | pandas-profiling v3.2.0 |
Download configuration | config.json |
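As a small consistency check, the reported duration can be recomputed from the two timestamps above. This is only a sketch using the Python standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Assumed layout of the timestamps shown in the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-06-08 06:41:45.164902", FMT)
finished = datetime.strptime("2022-06-08 06:41:45.355088", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 0.19 seconds
```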
patient_id
Real number (ℝ≥0)
Distinct | 2983 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1480195.818 |
Minimum | 217430 |
---|---|
Maximum | 1890273 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 23.4 KiB |
Quantile statistics
Minimum | 217430 |
---|---|
5-th percentile | 373107.4 |
Q1 | 1012769 |
median | 1759273 |
Q3 | 1822446.5 |
95-th percentile | 1874265.5 |
Maximum | 1890273 |
Range | 1672843 |
Interquartile range (IQR) | 809677.5 |
Descriptive statistics
Standard deviation | 495571.303 |
---|---|
Coefficient of variation (CV) | 0.3348011776 |
Kurtosis | -0.03321233942 |
Mean | 1480195.818 |
Median Absolute Deviation (MAD) | 83815 |
Skewness | -1.192814676 |
Sum | 4415424125 |
Variance | 2.455909164 × 10^11 |
Monotonicity | Strictly increasing |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
217430 | 1 | < 0.1% |
1815916 | 1 | < 0.1% |
1815648 | 1 | < 0.1% |
1815651 | 1 | < 0.1% |
1815663 | 1 | < 0.1% |
1815680 | 1 | < 0.1% |
1815822 | 1 | < 0.1% |
1815823 | 1 | < 0.1% |
1815824 | 1 | < 0.1% |
1815826 | 1 | < 0.1% |
Other values (2973) | 2973 | 99.7% |
Value | Count | Frequency (%) |
217430 | 1 | < 0.1% |
222679 | 1 | < 0.1% |
223180 | 1 | < 0.1% |
224137 | 1 | < 0.1% |
228560 | 1 | < 0.1% |
228659 | 1 | < 0.1% |
231280 | 1 | < 0.1% |
231623 | 1 | < 0.1% |
233619 | 1 | < 0.1% |
233639 | 1 | < 0.1% |
Value | Count | Frequency (%) |
1890273 | 1 | < 0.1% |
1889974 | 1 | < 0.1% |
1889411 | 1 | < 0.1% |
1889407 | 1 | < 0.1% |
1889108 | 1 | < 0.1% |
1889007 | 1 | < 0.1% |
1888925 | 1 | < 0.1% |
1888840 | 1 | < 0.1% |
1888663 | 1 | < 0.1% |
1888620 | 1 | < 0.1% |
age_nm
Real number (ℝ≥0)
Distinct | 59 |
---|---|
Distinct (%) | 2.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 59.91451559 |
Minimum | 22 |
---|---|
Maximum | 80 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 23.4 KiB |
Quantile statistics
Minimum | 22 |
---|---|
5-th percentile | 40 |
Q1 | 51 |
median | 61 |
Q3 | 69 |
95-th percentile | 78 |
Maximum | 80 |
Range | 58 |
Interquartile range (IQR) | 18 |
Descriptive statistics
Standard deviation | 11.75146724 |
---|---|
Coefficient of variation (CV) | 0.196137232 |
Kurtosis | -0.4491068627 |
Mean | 59.91451559 |
Median Absolute Deviation (MAD) | 9 |
Skewness | -0.3464300774 |
Sum | 178725 |
Variance | 138.0969824 |
Monotonicity | Not monotonic |
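Several of the age_nm statistics above are derived from one another, so they can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1, range = maximum − minimum); the inputs are copied from the tables, and the comments give the reported values for comparison rather than guaranteed program output.

```python
import math

# Inputs copied from the age_nm tables above.
n = 2983             # observations (no missing values for age_nm)
total = 178725       # "Sum"
std = 11.75146724    # "Standard deviation"
q1, q3 = 51, 69
minimum, maximum = 22, 80

mean = total / n               # report: 59.91451559
cv = std / mean                # report: 0.196137232
variance = std ** 2            # report: 138.0969824
iqr = q3 - q1                  # report: 18
value_range = maximum - minimum  # report: 58

# Compare against the reported figures with a small tolerance.
assert math.isclose(mean, 59.91451559, abs_tol=1e-6)
assert math.isclose(cv, 0.196137232, abs_tol=1e-6)
assert math.isclose(variance, 138.0969824, abs_tol=1e-5)
print(iqr, value_range)  # 18 58
```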
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
68 | 122 | 4.1% |
66 | 114 | 3.8% |
62 | 112 | 3.8% |
58 | 110 | 3.7% |
56 | 103 | 3.5% |
60 | 99 | 3.3% |
64 | 87 | 2.9% |
63 | 83 | 2.8% |
59 | 81 | 2.7% |
50 | 78 | 2.6% |
Other values (49) | 1994 | 66.8% |
Value | Count | Frequency (%) |
22 | 2 | 0.1% |
23 | 2 | 0.1% |
24 | 3 | 0.1% |
25 | 2 | 0.1% |
26 | 2 | 0.1% |
27 | 4 | 0.1% |
28 | 2 | 0.1% |
29 | 5 | 0.2% |
30 | 5 | 0.2% |
31 | 6 | 0.2% |
Value | Count | Frequency (%) |
80 | 50 | 1.7% |
79 | 52 | 1.7% |
78 | 66 | 2.2% |
77 | 56 | 1.9% |
76 | 62 | 2.1% |
75 | 72 | 2.4% |
74 | 73 | 2.4% |
73 | 61 | 2.0% |
72 | 67 | 2.2% |
71 | 66 | 2.2% |
country_cd
Categorical
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 172.0 KiB |
LV |
---|
Characters and Unicode
Total characters | 5966 |
---|---|
Distinct characters | 2 |
Distinct categories | 1 |
Distinct scripts | 1 |
Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | LV |
---|---|
2nd row | LV |
3rd row | LV |
4th row | LV |
5th row | LV |
Common Values
Value | Count | Frequency (%) |
LV | 2983 | 100.0% |
Category Frequency Plot
Value | Count | Frequency (%) |
lv | 2983 | 100.0% |
Most occurring characters
Value | Count | Frequency (%) |
L | 2983 | 50.0% |
V | 2983 | 50.0% |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 5966 | 100.0% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
L | 2983 | 50.0% |
V | 2983 | 50.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 5966 | 100.0% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
L | 2983 | 50.0% |
V | 2983 | 50.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 5966 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
L | 2983 | 50.0% |
V | 2983 | 50.0% |
hospital_id
Real number (ℝ≥0)
Distinct | 13 |
---|---|
Distinct (%) | 0.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2752.313778 |
Minimum | 190 |
---|---|
Maximum | 4182 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 23.4 KiB |
Quantile statistics
Minimum | 190 |
---|---|
5-th percentile | 351 |
Q1 | 351 |
median | 3943 |
Q3 | 3943 |
95-th percentile | 3943 |
Maximum | 4182 |
Range | 3992 |
Interquartile range (IQR) | 3592 |
Descriptive statistics
Standard deviation | 1590.829292 |
---|---|
Coefficient of variation (CV) | 0.5779970672 |
Kurtosis | -1.335568712 |
Mean | 2752.313778 |
Median Absolute Deviation (MAD) | 0 |
Skewness | -0.7252819208 |
Sum | 8210152 |
Variance | 2530737.836 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%) |
3943 | 1807 | 60.6% |
351 | 840 | 28.2% |
2032 | 193 | 6.5% |
2806 | 120 | 4.0% |
3186 | 10 | 0.3% |
4180 | 3 | 0.1% |
497 | 3 | 0.1% |
4182 | 2 | 0.1% |
2816 | 1 | < 0.1% |
535 | 1 | < 0.1% |
Other values (3) | 3 | 0.1% |
Value | Count | Frequency (%) |
190 | 1 | < 0.1% |
351 | 840 | 28.2% |
497 | 3 | 0.1% |
535 | 1 | < 0.1% |
1696 | 1 | < 0.1% |
1923 | 1 | < 0.1% |
2032 | 193 | 6.5% |
2806 | 120 | 4.0% |
2816 | 1 | < 0.1% |
3186 | 10 | 0.3% |
Value | Count | Frequency (%) |
4182 | 2 | 0.1% |
4180 | 3 | 0.1% |
3943 | 1807 | 60.6% |
3186 | 10 | 0.3% |
2816 | 1 | < 0.1% |
2806 | 120 | 4.0% |
2032 | 193 | 6.5% |
1923 | 1 | < 0.1% |
1696 | 1 | < 0.1% |
535 | 1 | < 0.1% |
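The hospital_id frequency tables above list all 13 distinct values between them, so the full column can be rebuilt and the summary statistics re-derived. The sketch below is a check using only the Python standard library; the `counts` dictionary is merged by hand from the three tables, and the MAD is taken here as the median of absolute deviations from the median, which is an assumption about the definition the profiler uses.

```python
from statistics import median

# value -> count, merged from the common-value, minimum, and maximum tables.
counts = {
    190: 1, 351: 840, 497: 3, 535: 1, 1696: 1, 1923: 1, 2032: 193,
    2806: 120, 2816: 1, 3186: 10, 3943: 1807, 4180: 3, 4182: 2,
}

# Expand the counts back into one value per observation.
values = sorted(v for v, c in counts.items() for _ in range(c))

med = median(values)
# Assumed definition: median of |x - median(x)|.
mad = median([abs(v - med) for v in values])

print(len(values))  # 2983
print(sum(values))  # 8210152
print(med)          # 3943
print(mad)          # 0
```

The MAD of 0 follows from 1807 of the 2983 records (more than half) sharing the median value 3943, so at least half of the absolute deviations are zero.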
ttm_type_cd
Real number (ℝ≥0)
Distinct | 4 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1.209185384 |
Minimum | 1 |
---|---|
Maximum | 5 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 23.4 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 1 |
median | 1 |
Q3 | 1 |
95-th percentile | 3 |
Maximum | 5 |
Range | 4 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 0.63948659 |
---|---|
Coefficient of variation (CV) | 0.5288573601 |
Kurtosis | 9.481374767 |
Mean | 1.209185384 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 3.101984315 |
Sum | 3607 |
Variance | 0.4089430988 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=4)
Value | Count | Frequency (%) |
1 | 2671 | 89.5% |
3 | 261 | 8.7% |
2 | 34 | 1.1% |
5 | 17 | 0.6% |
Value | Count | Frequency (%) |
1 | 2671 | 89.5% |
2 | 34 | 1.1% |
3 | 261 | 8.7% |
5 | 17 | 0.6% |
Value | Count | Frequency (%) |
5 | 17 | 0.6% |
3 | 261 | 8.7% |
2 | 34 | 1.1% |
1 | 2671 | 89.5% |
time_dx_to_surgery_nm
Real number (ℝ≥0)
Distinct | 199 |
---|---|
Distinct (%) | 6.9% |
Missing | 117 |
Missing (%) | 3.9% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 20.4867411 |
Minimum | 0 |
---|---|
Maximum | 1365 |
Zeros | 1167 |
Zeros (%) | 39.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 23.4 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 1 |
Q3 | 1 |
95-th percentile | 156 |
Maximum | 1365 |
Range | 1365 |
Interquartile range (IQR) | 1 |
Descriptive statistics
Standard deviation | 75.51028962 |
---|---|
Coefficient of variation (CV) | 3.685812655 |
Kurtosis | 96.49901427 |
Mean | 20.4867411 |
Median Absolute Deviation (MAD) | 1 |
Skewness | 7.905235901 |
Sum | 58715 |
Variance | 5701.803838 |
Monotonicity | Not monotonic |
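For a variable with missing values, the percentages above do not all share one denominator. The figures for this variable are consistent with the mean and "Distinct (%)" being computed over the non-missing values only, while "Zeros (%)" and "Missing (%)" are computed over all observations. The sketch below illustrates that reading; it is inferred from the reported numbers rather than taken from the tool's documentation, and the comments give the reported values for comparison.

```python
# Inputs copied from the tables above.
n_observations = 2983
n_missing = 117
n_distinct = 199
n_zeros = 1167
total = 58715          # "Sum" (missing values excluded)

n_present = n_observations - n_missing

mean = total / n_present                        # report: 20.4867411
distinct_pct = 100 * n_distinct / n_present     # report: 6.9%
zeros_pct = 100 * n_zeros / n_observations      # report: 39.1%
missing_pct = 100 * n_missing / n_observations  # report: 3.9%

print(n_present)  # 2866
print(f"{distinct_pct:.1f}% {zeros_pct:.1f}% {missing_pct:.1f}%")
```

The same pattern holds for the other time_dx_to_* variables, which matters when comparing "Distinct (%)" across columns with very different missing rates.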
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
0 | 1167 | 39.1% |
1 | 1140 | 38.2% |
2 | 50 | 1.7% |
3 | 32 | 1.1% |
4 | 15 | 0.5% |
6 | 11 | 0.4% |
14 | 10 | 0.3% |
15 | 10 | 0.3% |
5 | 10 | 0.3% |
12 | 9 | 0.3% |
Other values (189) | 412 | 13.8% |
(Missing) | 117 | 3.9% |
Value | Count | Frequency (%) |
0 | 1167 | 39.1% |
1 | 1140 | 38.2% |
2 | 50 | 1.7% |
3 | 32 | 1.1% |
4 | 15 | 0.5% |
5 | 10 | 0.3% |
6 | 11 | 0.4% |
7 | 5 | 0.2% |
8 | 7 | 0.2% |
10 | 3 | 0.1% |
Value | Count | Frequency (%) |
1365 | 1 | < 0.1% |
1308 | 1 | < 0.1% |
960 | 1 | < 0.1% |
955 | 1 | < 0.1% |
729 | 1 | < 0.1% |
723 | 1 | < 0.1% |
679 | 1 | < 0.1% |
676 | 1 | < 0.1% |
614 | 1 | < 0.1% |
589 | 1 | < 0.1% |
time_dx_to_radiotherapy_nm
Real number (ℝ≥0)
Distinct | 360 |
---|---|
Distinct (%) | 27.1% |
Missing | 1654 |
Missing (%) | 55.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 181.5221971 |
Minimum | 0 |
---|---|
Maximum | 1683 |
Zeros | 11 |
Zeros (%) | 0.4% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 23.4 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 37 |
Q1 | 84 |
median | 167 |
Q3 | 237 |
95-th percentile | 335 |
Maximum | 1683 |
Range | 1683 |
Interquartile range (IQR) | 153 |
Descriptive statistics
Standard deviation | 156.8715939 |
---|---|
Coefficient of variation (CV) | 0.8642006124 |
Kurtosis | 30.79096196 |
Mean | 181.5221971 |
Median Absolute Deviation (MAD) | 77 |
Skewness | 4.423784909 |
Sum | 241243 |
Variance | 24608.69698 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
49 | 15 | 0.5% |
78 | 14 | 0.5% |
238 | 13 | 0.4% |
63 | 13 | 0.4% |
42 | 13 | 0.4% |
224 | 12 | 0.4% |
76 | 12 | 0.4% |
91 | 11 | 0.4% |
244 | 11 | 0.4% |
43 | 11 | 0.4% |
Other values (350) | 1204 | 40.4% |
(Missing) | 1654 | 55.4% |
Value | Count | Frequency (%) |
0 | 11 | 0.4% |
5 | 2 | 0.1% |
6 | 1 | < 0.1% |
7 | 4 | 0.1% |
8 | 1 | < 0.1% |
10 | 2 | 0.1% |
17 | 1 | < 0.1% |
19 | 1 | < 0.1% |
22 | 1 | < 0.1% |
23 | 1 | < 0.1% |
Value | Count | Frequency (%) |
1683 | 1 | < 0.1% |
1619 | 1 | < 0.1% |
1475 | 1 | < 0.1% |
1460 | 1 | < 0.1% |
1456 | 1 | < 0.1% |
1355 | 1 | < 0.1% |
1332 | 1 | < 0.1% |
1219 | 1 | < 0.1% |
1126 | 1 | < 0.1% |
1118 | 1 | < 0.1% |
time_dx_to_chemotherapy_nm
Real number (ℝ≥0)
Distinct | 193 |
---|---|
Distinct (%) | 21.2% |
Missing | 2074 |
Missing (%) | 69.5% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 74.82728273 |
Minimum | 0 |
---|---|
Maximum | 1743 |
Zeros | 76 |
Zeros (%) | 2.5% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 23.4 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 23 |
median | 41 |
Q3 | 68 |
95-th percentile | 172 |
Maximum | 1743 |
Range | 1743 |
Interquartile range (IQR) | 45 |
Descriptive statistics
Standard deviation | 158.9419445 |
---|---|
Coefficient of variation (CV) | 2.124117551 |
Kurtosis | 42.78523831 |
Mean | 74.82728273 |
Median Absolute Deviation (MAD) | 21 |
Skewness | 6.084945686 |
Sum | 68018 |
Variance | 25262.54172 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
0 | 76 | 2.5% |
30 | 22 | 0.7% |
1 | 21 | 0.7% |
36 | 19 | 0.6% |
31 | 18 | 0.6% |
23 | 17 | 0.6% |
50 | 17 | 0.6% |
51 | 16 | 0.5% |
42 | 16 | 0.5% |
28 | 15 | 0.5% |
Other values (183) | 672 | 22.5% |
(Missing) | 2074 | 69.5% |
Value | Count | Frequency (%) |
0 | 76 | 2.5% |
1 | 21 | 0.7% |
2 | 8 | 0.3% |
3 | 2 | 0.1% |
4 | 9 | 0.3% |
5 | 10 | 0.3% |
6 | 5 | 0.2% |
7 | 12 | 0.4% |
8 | 6 | 0.2% |
10 | 5 | 0.2% |
Value | Count | Frequency (%) |
1743 | 1 | < 0.1% |
1449 | 1 | < 0.1% |
1315 | 1 | < 0.1% |
1295 | 1 | < 0.1% |
1278 | 1 | < 0.1% |
1222 | 1 | < 0.1% |
1188 | 1 | < 0.1% |
1078 | 1 | < 0.1% |
982 | 1 | < 0.1% |
938 | 1 | < 0.1% |
time_dx_to_immunotherapy_nm
Real number (ℝ≥0)
Distinct | 1 |
---|---|
Distinct (%) | 100.0% |
Missing | 2982 |
Missing (%) | > 99.9% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1035 |
Minimum | 1035 |
---|---|
Maximum | 1035 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 23.4 KiB |
Quantile statistics
Minimum | 1035 |
---|---|
5-th percentile | 1035 |
Q1 | 1035 |
median | 1035 |
Q3 | 1035 |
95-th percentile | 1035 |
Maximum | 1035 |
Range | 0 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | nan |
---|---|
Coefficient of variation (CV) | nan |
Kurtosis | nan |
Mean | 1035 |
Median Absolute Deviation (MAD) | 0 |
Skewness | nan |
Sum | 1035 |
Variance | nan |
Monotonicity | Strictly increasing |
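The `nan` entries above are what one would expect when a sample statistic is computed from a single non-missing value: the sample variance divides by n − 1, which is zero here. The sketch below illustrates this with a hand-written estimator, assuming the profiler uses the sample (n − 1) form, as pandas does by default; `sample_variance` is a hypothetical helper written for this example, not part of any library.

```python
import math

def sample_variance(values):
    """Unbiased sample variance (divides by n - 1); NaN when n < 2."""
    n = len(values)
    if n < 2:
        # With fewer than two observations the n - 1 divisor is zero,
        # so the estimate is undefined.
        return math.nan
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

observed = [1035.0]          # the only non-missing value in this column
variance = sample_variance(observed)
std = math.sqrt(variance)    # the square root of NaN is NaN

print(variance, std)  # nan nan
```

The coefficient of variation, skewness, and kurtosis depend on the standard deviation or on similar small-sample divisors, so they are undefined for the same reason.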
Histogram with fixed size bins (bins=1)
Value | Count | Frequency (%) |
1035 | 1 | < 0.1% |
(Missing) | 2982 | > 99.9% |
Value | Count | Frequency (%) |
1035 | 1 | < 0.1% |
Value | Count | Frequency (%) |
1035 | 1 | < 0.1% |
time_dx_to_hormonotherapy_nm
Real number (ℝ≥0)
Distinct | 221 |
---|---|
Distinct (%) | 76.5% |
Missing | 2694 |
Missing (%) | 90.3% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 227.0103806 |
Minimum | 0 |
---|---|
Maximum | 1441 |
Zeros | 1 |
Zeros (%) | < 0.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 23.4 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 22.8 |
Q1 | 91 |
median | 159 |
Q3 | 293 |
95-th percentile | 627.6 |
Maximum | 1441 |
Range | 1441 |
Interquartile range (IQR) | 202 |
Descriptive statistics
Standard deviation | 221.2526284 |
---|---|
Coefficient of variation (CV) | 0.9746366127 |
Kurtosis | 7.125520242 |
Mean | 227.0103806 |
Median Absolute Deviation (MAD) | 91 |
Skewness | 2.357220451 |
Sum | 65606 |
Variance | 48952.72559 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
154 | 5 | 0.2% |
158 | 3 | 0.1% |
103 | 3 | 0.1% |
56 | 3 | 0.1% |
97 | 3 | 0.1% |
77 | 3 | 0.1% |
159 | 3 | 0.1% |
39 | 3 | 0.1% |
188 | 3 | 0.1% |
144 | 3 | 0.1% |
Other values (211) | 257 | 8.6% |
(Missing) | 2694 | 90.3% |
Value | Count | Frequency (%) |
0 | 1 | < 0.1% |
2 | 1 | < 0.1% |
7 | 1 | < 0.1% |
8 | 1 | < 0.1% |
14 | 2 | 0.1% |
15 | 1 | < 0.1% |
17 | 2 | 0.1% |
18 | 1 | < 0.1% |
19 | 1 | < 0.1% |
21 | 2 | 0.1% |
Value | Count | Frequency (%) |
1441 | 1 | < 0.1% |
1315 | 1 | < 0.1% |
1089 | 1 | < 0.1% |
1073 | 1 | < 0.1% |
1031 | 1 | < 0.1% |
983 | 1 | < 0.1% |
969 | 1 | < 0.1% |
921 | 1 | < 0.1% |
880 | 1 | < 0.1% |
827 | 1 | < 0.1% |
period
Real number (ℝ≥0)
Distinct | 49 |
---|---|
Distinct (%) | 1.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 18.48642306 |
Minimum | 1 |
---|---|
Maximum | 62 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 23.4 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 3 |
Q1 | 10 |
median | 18 |
Q3 | 27 |
95-th percentile | 35 |
Maximum | 62 |
Range | 61 |
Interquartile range (IQR) | 17 |
Descriptive statistics
Standard deviation | 10.44719725 |
---|---|
Coefficient of variation (CV) | 0.5651281056 |
Kurtosis | -1.037624847 |
Mean | 18.48642306 |
Median Absolute Deviation (MAD) | 9 |
Skewness | 0.09274071826 |
Sum | 55145 |
Variance | 109.1439303 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=49)
Value | Count | Frequency (%) |
13 | 115 | 3.9% |
8 | 113 | 3.8% |
10 | 101 | 3.4% |
20 | 99 | 3.3% |
22 | 95 | 3.2% |
25 | 95 | 3.2% |
34 | 94 | 3.2% |
27 | 94 | 3.2% |
11 | 92 | 3.1% |
15 | 91 | 3.1% |
Other values (39) | 1994 | 66.8% |
Value | Count | Frequency (%) |
1 | 85 | 2.8% |
2 | 61 | 2.0% |
3 | 84 | 2.8% |
4 | 82 | 2.7% |
5 | 82 | 2.7% |
6 | 84 | 2.8% |
7 | 59 | 2.0% |
8 | 113 | 3.8% |
9 | 88 | 3.0% |
10 | 101 | 3.4% |
Value | Count | Frequency (%) |
62 | 1 | < 0.1% |
51 | 1 | < 0.1% |
49 | 1 | < 0.1% |
46 | 1 | < 0.1% |
45 | 1 | < 0.1% |
44 | 1 | < 0.1% |
43 | 1 | < 0.1% |
42 | 1 | < 0.1% |
41 | 3 | 0.1% |
40 | 4 | 0.1% |