I have some columns in my dataframe that look like:
total NaN 26-27 52-53 88-89 165 280 399 611 962 1407 1937
I would like to transform them into numerical values using a round-up:
total NaN 27 53 89 165 280 399 611 962 1407 1937
Clearly, pd.to_numeric() does not work because 26-27 is a string (the column has object dtype). I can convert the values one by one, but is there an elegant and fast way to do the transformation?
score:3
If you truly need the larger of the two numbers, which could be out of order, you can do this:
df.total.str.split('-').apply(pd.Series).astype(float).max(axis=1).astype('Int64')
0 NaN
1 27
2 53
3 89
4 165
5 280
6 399
7 611
8 962
9 1407
10 1937
dtype: Int64
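As a possible variation on the line above, str.split can expand the pieces into columns directly, which avoids apply(pd.Series). This is only a minimal sketch: the sample values are made up for illustration, and the second range is reversed on purpose to show that the order does not matter.

```python
import pandas as pd

# Hypothetical sample data (not from the question); "53-52" is deliberately reversed.
df = pd.DataFrame({"total": [None, "26-27", "53-52", "165"]})

# expand=True returns one column per piece, so apply(pd.Series) is not needed.
parts = df["total"].str.split("-", expand=True).astype(float)

# The row-wise max picks the larger endpoint; missing pieces are skipped.
rounded_up = parts.max(axis=1).astype("Int64")
print(rounded_up)
```

Rows without a "-" simply have a missing second piece, so the single number passes through unchanged.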
score:0
Yet another solution with regex:
df.total.str.replace(r"\d+\-", "", regex=True).astype(float)
score:2
You can split the string on the - sign and take the last element with split. This allows you to convert the data to float and then to integer if you like.
>>> df.total.str.split('-').str[-1].astype(float)
0 NaN
1 27.0
2 53.0
3 89.0
4 165.0
5 280.0
6 399.0
7 611.0
8 962.0
9 1407.0
10 1937.0
Name: total, dtype: float64
Or if you want to cast to integer,
>>> df.total.str.split('-').str[-1].astype(float).astype('Int64')
0 NaN
1 27
2 53
3 89
4 165
5 280
6 399
7 611
8 962
9 1407
10 1937
Name: total, dtype: Int64
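Since the question mentions several such columns, the same split-and-take-last step could be wrapped in a small helper and applied column by column. This is a rough sketch under assumptions: the column name "other", the helper name upper_bound, and the sample values are all hypothetical.

```python
import pandas as pd

# Hypothetical frame with two range-style columns; names and values are assumptions.
df = pd.DataFrame({
    "total": [None, "26-27", "165"],
    "other": ["1-2", None, "30"],
})

def upper_bound(col: pd.Series) -> pd.Series:
    # Keep the piece after the last "-", then convert to nullable integers.
    return col.str.split("-").str[-1].astype(float).astype("Int64")

cols = ["total", "other"]
# DataFrame.apply calls the helper once per selected column.
converted = df[cols].apply(upper_bound)
print(converted)
```

Note that this takes the last number rather than the largest, so it assumes each range is written in ascending order.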
score:6
IIUC, we can use a little bit of regex to extract the last run of digits in each string, anchoring the match at the end with $.
\d+ matches one or more digits: \d matches a digit (equal to [0-9]), and the + quantifier matches between one and unlimited times.
$ asserts the position at the end of the string (or just before a trailing \n).
df['total'].str.extract(r'(\d+)$', expand=False).astype(float)
out:
0 NaN
1 27.0
2 53.0
3 89.0
4 165.0
5 280.0
6 399.0
7 611.0
8 962.0
9 1407.0
10 1937.0
Name: total, dtype: float64
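To tie this back to pd.to_numeric() from the question, here is a minimal sketch that extracts the trailing digits first and only then converts. The sample Series is made up for illustration, and expand=False is used so that the result is a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Hypothetical sample values, not the full column from the question.
s = pd.Series([None, "26-27", "165"], name="total")

# Capture the run of digits that ends the string; non-matching rows become NaN.
last_number = s.str.extract(r"(\d+)$", expand=False)

# pd.to_numeric now succeeds because only digit strings (or NaN) remain.
result = pd.to_numeric(last_number).astype("Int64")
print(result)
```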
Credit To: stackoverflow.com