I have some time series data as:
import pandas as pd index = pd.date_range('06/01/2014',periods=24*30,freq='H') df1 = pd.DataFrame(range(len(index)),index=index)
Now I want to subset data of below dates
selec_dates = ['2014-06-10','2014-06-15','2014-06-20']
I tried following statement but it is not working
sub_data = df1.loc[df1.index.isin(pd.to_datetime(selec_dates))]
Where am I doing wrong? Is there any other approach to subset selected days data?
score:2
You can use .query() method:
In [202]: df1.query('@index.normalize() in @selec_dates')
Out[202]:
0
2014-06-10 00:00:00 216
2014-06-10 01:00:00 217
2014-06-10 02:00:00 218
2014-06-10 03:00:00 219
2014-06-10 04:00:00 220
2014-06-10 05:00:00 221
2014-06-10 06:00:00 222
2014-06-10 07:00:00 223
2014-06-10 08:00:00 224
2014-06-10 09:00:00 225
... ...
2014-06-20 14:00:00 470
2014-06-20 15:00:00 471
2014-06-20 16:00:00 472
2014-06-20 17:00:00 473
2014-06-20 18:00:00 474
2014-06-20 19:00:00 475
2014-06-20 20:00:00 476
2014-06-20 21:00:00 477
2014-06-20 22:00:00 478
2014-06-20 23:00:00 479
[72 rows x 1 columns]
Similar question
- Subset selected days data in Python
- Create hash value for each row of data with selected columns in dataframe in python pandas
- Python - Retrieving last 30 days data from dataframe pandas
- Process subset of data based on variable type in python
- Python - Subset data between two datetime
- Calculation of days until next and since last holiday based on date in Data Frame in Python Pandas?
- Assigning selected values of a column of given data frame to another data frame in Python
- Concatenating selected columns from two data frames in python pandas
- Python Dash: Return subset of a data frame using drop-down menu
- Python subset a data frame based on a variable name
score:1
Use the string repr of the date, leaving out the time periods in the day.
pd.concat([df1['2014-06-10'] , df1['2014-06-15'], df1['2014-06-20']])
score:2
Edit: I have been made aware this only works if you are working with a daterange in the same month and year as in your query. For a more general (and better answer) see @jezrael solution.
You can use np.in1d
and .day
on your index if you wanted to do it as you tried:
selec_dates = ['2014-06-10','2014-06-15','2014-06-20']
df1.loc[np.in1d(df1.index.day, (pd.to_datetime(selec_dates).day))]
This gives you as you require:
2014-06-10 00:00:00 216
2014-06-10 01:00:00 217
2014-06-10 02:00:00 218
2014-06-10 03:00:00 219
2014-06-10 04:00:00 220
2014-06-10 05:00:00 221
2014-06-10 06:00:00 222
2014-06-10 07:00:00 223
2014-06-10 08:00:00 224
2014-06-10 09:00:00 225
2014-06-10 10:00:00 226
2014-06-10 11:00:00 227
2014-06-10 12:00:00 228
2014-06-10 13:00:00 229
2014-06-10 14:00:00 230
2014-06-10 15:00:00 231
2014-06-10 16:00:00 232
2014-06-10 17:00:00 233
2014-06-10 18:00:00 234
2014-06-10 19:00:00 235
2014-06-10 20:00:00 236
2014-06-10 21:00:00 237
2014-06-10 22:00:00 238
2014-06-10 23:00:00 239
2014-06-15 00:00:00 336
2014-06-15 01:00:00 337
2014-06-15 02:00:00 338
2014-06-15 03:00:00 339
2014-06-15 04:00:00 340
2014-06-15 05:00:00 341
...
2014-06-15 18:00:00 354
2014-06-15 19:00:00 355
2014-06-15 20:00:00 356
2014-06-15 21:00:00 357
2014-06-15 22:00:00 358
2014-06-15 23:00:00 359
2014-06-20 00:00:00 456
2014-06-20 01:00:00 457
2014-06-20 02:00:00 458
2014-06-20 03:00:00 459
2014-06-20 04:00:00 460
2014-06-20 05:00:00 461
2014-06-20 06:00:00 462
2014-06-20 07:00:00 463
2014-06-20 08:00:00 464
2014-06-20 09:00:00 465
2014-06-20 10:00:00 466
2014-06-20 11:00:00 467
2014-06-20 12:00:00 468
2014-06-20 13:00:00 469
2014-06-20 14:00:00 470
2014-06-20 15:00:00 471
2014-06-20 16:00:00 472
2014-06-20 17:00:00 473
2014-06-20 18:00:00 474
2014-06-20 19:00:00 475
2014-06-20 20:00:00 476
2014-06-20 21:00:00 477
2014-06-20 22:00:00 478
2014-06-20 23:00:00 479
[72 rows x 1 columns]
I used these Sources for this answer:
- Selecting a subset of a Pandas DataFrame indexed by DatetimeIndex with a list of TimeStamps
- In Python-Pandas, How can I subset a dataframe by specific datetime index values?
- return pandas DF column with the number of days elapsed between index and today's date
- Get weekday/day-of-week for Datetime column of DataFrame
- https://stackoverflow.com/a/36893416/2254228
score:3
I'm sorry and misunderstood your question
df1[pd.Series(df1.index.date, index=df1.index).isin(pd.to_datetime(selec_dates).date)]
Should perform what was needed
original answer
Please check the pandas
documentation on selection
You can easily do
sub_data = df1.loc[pd.to_datetime(selec_dates)]
score:6
You need compare dates
and for test membership use numpy.in1d
:
sub_data = df1.loc[np.in1d(df1.index.date, pd.to_datetime(selec_dates).date)]
print (sub_data)
a
2014-06-10 00:00:00 216
2014-06-10 01:00:00 217
2014-06-10 02:00:00 218
2014-06-10 03:00:00 219
2014-06-10 04:00:00 220
2014-06-10 05:00:00 221
2014-06-10 06:00:00 222
2014-06-10 07:00:00 223
2014-06-10 08:00:00 224
2014-06-10 09:00:00 225
2014-06-10 10:00:00 226
...
If want use isin
, is necessary create Series
with same index:
sub_data = df1.loc[pd.Series(df1.index.date, index=df1.index)
.isin(pd.to_datetime(selec_dates).date)]
print (sub_data)
a
2014-06-10 00:00:00 216
2014-06-10 01:00:00 217
2014-06-10 02:00:00 218
2014-06-10 03:00:00 219
2014-06-10 04:00:00 220
2014-06-10 05:00:00 221
2014-06-10 06:00:00 222
2014-06-10 07:00:00 223
2014-06-10 08:00:00 224
2014-06-10 09:00:00 225
2014-06-10 10:00:00 226
2014-06-10 11:00:00 227
...
Credit To: stackoverflow.com
Related Query
- aggregate irregular data to monthly averages weighing for days in months,using python
- Pandas subset randomly selected number of rows from dataframe based on values in another data
- Python data manipulation: How to group by ID, then subset rows forward/back 1 month base on datatime conditions in each group?
- How to pull EOD stock data from yahoo finance for excatly last 20 WORKING Days using Pandas in Python 2.7
- How do I randomly sample a set of groups in a Python Data Frame, keeping all records of each selected group, but WITHOUT merging?
- Subset data in python using another dataframe
- Performing calculations on subset of data frame subset in Python
- How to calculate the days difference between all the dates of a dataframe column and a single data in Python
- Roll data forward in python keeping the days consistent
- How to select the latest records from time series data for each patient id in python and we have to merge the hourly data into days
- How to Normalize or standardize specific or selected features of a data set in python
- Python Pandas Error tokenizing data
- Python pandas Filtering out nan from a data selection of a column of strings
- Python Pandas - Find difference between two data frames
- unique combinations of values in selected columns in pandas data frame and count
- Python Pandas add column for row-wise max value of selected columns
- Python "TypeError: unhashable type: 'slice'" for encoding categorical data
- How to handle incoming real time data with python pandas
- Python Pandas Group by date using datetime data
- Filter data frame based on index value in Python
More Query from same tag
- How to add several dataframes to each other and find the number of rows that are repeats?
- How to search a substring from one df in another df?
- Convert dataframe to series for multiple column
- Using np.where to create a new column in pandas df based on conditions
- Appending a list to the top of Pandas DataFrame output
- Find all ids that have 2 specific values for a one column
- how to solve 'list' object has no attribute 'apply'
- pandas assign with new column name as string
- Nested tree DataFrame reshaping
- Which version of duplicate feature columns removal in machine learning is faster and why?
- pandas add() two dataframes with parcially overlapping indexes?
- How do you write data in a specific cell in an excel file without overwriting data (using pandas)?
- Python pandas nlargest() not working properly with keep = 'all'
- Pandas: new rows using time windowed or shifted variables
- Pandas: Create a column where row is equal to the row below in another column