osquery
By traviscj
- 4 minutes read - 662 wordsI’ve known about [osquery][] for a while, but recently spent some time digging around in it. [osquery]: https://osquery.io The basic idea is to provide a consistent SQL interface to a bunch of system data, instead of learning idiosyncrasies of individual commands (which themselves vary across operating systems). My hacker buddy Sharvil has worked on osquery a fair bit and explained to me a couple years ago that it actually uses [sqlite][]’s [virtual table functionality][vtab] to provide the interface – it’s a fascinating and brilliant project! [sqlite]: https://sqlite.org/ [vtab]: https://sqlite.org/vtab.html
Concretely, instead of running [uptime
]uptime and getting either
osx ➜ uptime
8:48 up 17:27, 10 users, load averages: 4.57 4.74 5.08
linux ➜ uptime
16:52:12 up 192 days, 9:59, 1 user, load average: 0.00, 0.00, 0.00
you can run: [uptime]: https://linux.die.net/man/1/uptime
osquery> SELECT * FROM uptime;
+------+-------+---------+---------+---------------+
| days | hours | minutes | seconds | total_seconds |
+------+-------+---------+---------+---------------+
| 2 | 1 | 18 | 38 | 177518 |
+------+-------+---------+---------+---------------+
We can also query for system information (shown here in line
mode)
osquery> .mode line
osquery> SELECT * FROM system_info;
hostname = traviscj.local
uuid = [redacted]
cpu_type = x86_64h
cpu_subtype = Intel x86-64h Haswell
cpu_brand = Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
cpu_physical_cores = 4
cpu_logical_cores = 8
physical_memory = 17179869184
hardware_vendor = Apple Inc.
hardware_model = MacBookPro11,2
hardware_version = 1.0
hardware_serial = [redacted]
computer_name = traviscj
local_hostname = traviscj
and even perform HTTP requests with the curl
table:
osquery> SELECT url, response_code FROM curl WHERE url IN ('https://traviscj.com/_status.json', 'https://pauljoos.com/_status.json');
+-----------------------------------+---------------+
| url | response_code |
+-----------------------------------+---------------+
| https://pauljoos.com/_status.json | 200 |
| https://traviscj.com/_status.json | 200 |
+-----------------------------------+---------------+
I tried to reproduce a simple ps
wrapper I have:
#!/bin/sh
# a simple script to show java processes without extremely long classpaths
ps aux | grep '[b]in/java' | grep -v $0 | awk '{ print $2" "$11" "$NF }'
and ended up with
SELECT pid
, SUBSTR(cmdline,
(WITH char_index(i, ch) AS (values(1, '') UNION ALL SELECT i+1, substr(cmdline, i, 1) AS r FROM char_index WHERE r != '')
SELECT MAX(i) FROM char_index WHERE char_index.ch = ' ')
) as java_main_class
FROM processes WHERE name = 'java';
which produces
osquery> SELECT pid, SUBSTR(cmdline, (WITH char_index(i, ch) AS (
...> values(1, '')
...> UNION ALL SELECT i+1, substr(cmdline, i, 1) AS r FROM char_index WHERE r != ''
...> )
...> SELECT MAX(i) FROM char_index WHERE char_index.ch = ' ')) as java_main_class FROM processes WHERE name = 'java';
+-------+----------------------------+
| pid | java_main_class |
+-------+----------------------------+
| 2736 | com.company.app1.FirstApp |
| 2737 | com.company.app2.SecondApp |
+-------+----------------------------+
This is, to me, some world-class SQL trickery.
Stepping through, the WITH ... SELECT
clause works like this.
osquery> WITH char_index(i, ch) AS (
...> values(1, '')
...> UNION ALL SELECT i+1, SUBSTR("foo bar baz", i, 1) as r FROM char_index WHERE r != ''
...> ) SELECT * FROM char_index;
+----+----+
| i | ch |
+----+----+
| 1 | |
| 2 | f |
| 3 | o |
| 4 | o |
| 5 | |
| 6 | b |
| 7 | a |
| 8 | r |
| 9 | |
| 10 | b |
| 11 | a |
| 12 | z |
+----+----+
Adding the WHERE char_index.ch = ' '
gets us just the spaces:
osquery> WITH char_index(i, ch) AS (
...> values(1, '')
...> UNION ALL SELECT i+1, SUBSTR("foo bar baz", i, 1) as r FROM char_index WHERE r != ''
...> ) SELECT * FROM char_index WHERE char_index.ch = ' ';
+---+----+
| i | ch |
+---+----+
| 5 | |
| 9 | |
+---+----+
and of course adding MAX(i)
gives the highest such index, 9
.
My expression then passes that entire thing into a SUBSTR(var, idx)
clause to get the last element, and applies it to cmdline
instead of a static string.
There’s an extensive schema listing on OSQuery’s website describing all the tables available in the latest version of OSQuery.