Tech TLDR;

  • Archive
  • Top Posts
  • GitHub
  • LinkedIn
  • Contact

Group Log Data by Timestamp in Python with Pandas

December 23, 2018 by admin

I often find myself with logs in the following format, exported in CSV.

timestamp,request_time_in_millis
1545509231863,101
1545509231863,38
1545541592109,202
1545595617945,53
1545595617945,1024

First column is timestamp in milliseconds, second column is timing for some request, also in milliseconds.

Often I’d like to see the min, max and average request time, group by day and hour. Here is how to get this insight using Pandas library in Python.

Let’s load sample data first.

import pandas as pd

# Generate sample data (normally would read this from a log file)
data = {
    'millis': [101,           38,            202,            53,            1024], 
    'time':   [1545509231863, 1545509231863, 1545541592109,  1545592017945, 1545592017945]
}
df = pd.DataFrame.from_dict(data)Code language: PHP (php)

At this point the data frame will look as follows:

   millis           time
0     101  1545564489895
1      38  1545564489895
2     202  1545590975516
3      53  1545592017945
4    1024  1545592017945

Now we can convert the timestamp to date and group by day and hour.

# Convert timestamp to datetime, note "unit" is set to "ms", 
# would be "s" timestamp in seconds
df['date'] = df['time'].apply(lambda x: pd.to_datetime(x, unit='ms'))

# Only keep columns we care about
df = df[['date', 'millis']]

# Prevent statistic view from splitting across terminal
pd.set_option('display.expand_frame_repr', False)

# See grouped statistics
df.groupby([df['date'].dt.day, df['date'].dt.hour]).describe()
Code language: PHP (php)

The final output, will be grouped by day of the month and hour of the day, as follows:

millis
           count   mean         std    min     25%    50%     75%     max
date date
22   20      2.0   69.5   44.547727   38.0   53.75   69.5   85.25   101.0
23   5       1.0  202.0         NaN  202.0  202.00  202.0  202.00   202.0
     19      2.0  538.5  686.600685   53.0  295.75  538.5  781.25  1024.0
Code language: CSS (css)

Filed Under: data, Python

Copyright © 2025 · eleven40 Pro Theme on Genesis Framework · WordPress · Log in