Log Parser

Question: Log files are ubiquitous in programming because they give insight into what a program is doing during its execution. Instead of searching through many log files on the filesystem with custom commands, we can load the data into kdb+ and query the logs in a standardized (and faster) manner. Log files differ in format from program to program, so it is not unusual to write a custom log parser for a given log format. Given a log format of '%(asctime)s %(name)s %(levelname)s:%(message)s', define a function 'lp' that takes a log file, represented as a file symbol, and returns a table with columns date, time, level, msg.

Example

$ cat sample.log
2019-03-27 21:16:41,200 INFO:starting function
2019-03-27 21:16:41,200 INFO:this file has 10 rows
2019-03-27 21:16:41,201 DEBUG:function completed for myfile.txt
2019-03-27 21:16:41,201 INFO:starting function
2019-03-27 21:16:41,202 ERROR:[Errno 2] No such file or directory: 'nonexistentfile.txt'
2019-03-27 21:16:41,202 DEBUG:function completed for nonexistentfile.txt

q)lp `:sample.log
date       time         level msg
------------------------------------------------------------------------------------------
2019.03.27 21:16:41.200 INFO  "starting function"
2019.03.27 21:16:41.200 INFO  "this file has 10 rows"
2019.03.27 21:16:41.201 DEBUG "function completed for myfile.txt"
2019.03.27 21:16:41.201 INFO  "starting function"
2019.03.27 21:16:41.202 ERROR "[Errno 2] No such file or directory: 'nonexistentfile.txt'"
2019.03.27 21:16:41.202 DEBUG "function completed for nonexistentfile.txt"
q)meta lp `:sample.log
c    | t f a
-----| -----
date | d
time | t
level| s
msg  | C

Solution
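
One possible approach, offered as a minimal sketch and not as the reference answer: read the file with read0, slice the fixed-width date and time off the front of each line, and split the remainder on its first colon only, since the message itself may contain colons. The sketch assumes every line looks exactly like those in sample.log (a 10-character date, a space, a 12-character time with a comma before the milliseconds, a space, then LEVEL:message). The local names l, d, t, r and i are illustrative.

```q
/ Sketch only: assumes each line is "YYYY-MM-DD HH:MM:SS,mmm LEVEL:message"
lp:{[f]
  l:read0 f;                          / one string per log line
  d:"D"$10#'l;                        / chars 0-9: the date
  t:"T"$@[;8;:;"."]each 12#'11_'l;    / chars 11-22: the time, with "," swapped for "."
  r:24_'l;                            / remainder of each line: "LEVEL:message"
  i:r?'":";                           / index of the first colon in each remainder
  ([]date:d;time:t;level:`$i#'r;msg:(i+1)_'r)
 }
```

Once loaded, the logs can be queried like any other table, for example select from lp[`:sample.log] where level=`ERROR, or select count i by level from lp `:sample.log. Lines that do not follow the assumed layout (blank lines, multi-line tracebacks) would produce nulls or misaligned fields and would need extra handling.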

Tags: dictionaries, iterators, strings, tables