-
Notifications
You must be signed in to change notification settings - Fork 0
/
TODO
431 lines (391 loc) · 29.4 KB
/
TODO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
top:
- Boringssl build - is this really a good idea? -DCMAKE_POSITION_INDEPENDENT_CODE:BOOL=OFF
- Fix CN/SAN validation failures with recent versions of boringssl
- Failures:
- 47/369 Test #47: ConnectionPoolTest.CompatibleServerCertificate .............................................................................................................***Failed 0.06 sec
- 52/369 Test #52: TLSSocketTest.ServerSNI ....................................................................................................................................***Failed 1.06 sec
- 53/369 Test #53: TLSSocketTest.MutualTLSHappyPath ...........................................................................................................................***Failed 1.04 sec
- Did the validation logic change or is it an issue with the SSL context swap?
- All test failures look like handshake failures:
- [1722966253:56018:INF] [test-client:127.0.0.1:47670>127.0.0.1:38503,7] cannot complete client TLS handshake (1): error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED [/home/jblatt/src/duderino/everscale/base/source/ESBClientTLSSocket.cpp:121]
- [1722966253:56019:INF] [sec-echo-serve:127.0.0.1:38503>127.0.0.1:47670,8] cannot complete server TLS handshake (1): error:10000412:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_CERTIFICATE [/home/jblatt/src/duderino/everscale/base/source/ESBServerTLSSocket.cpp:73]
- rename HttpTestParams HttpArguments or similar
- add counters to detect runaway epoll events (e.g., writable notifications when there is no data to send)
- put detection of remote close in epoll handler and call handleRemoteClose()? audit how client/server socket handle 0 reads. Should those also call remote close? Or just put the remote close logic only there?
[1613321028:18967:ERR] [orig-server:127.0.0.1:35641>127.0.0.1:37400,578] missing 1373526 bytes from 1373526 byte response body [/root/project/http/origin/source/ESHttpOriginHandler.cpp:92]
[1613321028:18967:CRT] [Aborted: 1/14]: gsignal
[1613321028:18967:CRT] [Aborted: 2/14]: abort
[1613321028:18967:CRT] [Aborted: 3/14]: /lib/x86_64-linux-gnu/libc.so.6(+0x3048a) [0x7ff2a107548a]
[1613321028:18967:CRT] [Aborted: 4/14]: /lib/x86_64-linux-gnu/libc.so.6(+0x30502) [0x7ff2a1075502]
[1613321028:18967:CRT] [Aborted: 5/14]: ES::HttpOriginHandler::consumeRequestBody(ES::HttpMultiplexer&, ES::HttpServerStream&, unsigned char const*, unsigned int, unsigned int*)
[1613321028:18967:CRT] [Aborted: 6/14]: ES::HttpServerSocket::stateReceiveRequestBody(ES::HttpServerHandler&)
[1613321028:18967:CRT] [Aborted: 7/14]: ES::HttpServerSocket::advanceStateMachine(ES::HttpServerHandler&, int)
[1613321028:18967:CRT] [Aborted: 8/14]: ES::HttpServerSocket::handleReadable()
[1613321028:18967:CRT] [Aborted: 9/14]: ESB::EpollMultiplexer::run(ESB::SharedInt*)
[1613321028:18967:CRT] [Aborted:10/14]: ES::HttpProxyMultiplexer::run(ESB::SharedInt*)
[1613321028:18967:CRT] [Aborted:11/14]: ESB::ThreadPoolWorker::run()
[1613321028:18967:CRT] [Aborted:12/14]: ESB::Thread::ThreadEntry(void*)
[1613321028:18967:CRT] [Aborted:13/14]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7ff2a185d6db]
[1613321028:18967:CRT] [Aborted:14/14]: clone
- Some change between de75ff65ea4c2ae866da5557c5da579b726c6759 and b6133cdecbb347431451c293fca62704f09c8bd0 reduced the connection reuse rate. Check updated connection pool hash?
de75ff65ea4c2ae866da5557c5da579b726c6759:
[1595691738:14892:WRN] SERVER CONNECTION ACCEPTS: 448 [/home/jblatt/src/duderino/everscale/http/http1/source/ESHttpServerSimpleCounters.cpp:40]
[1595691738:14892:WRN] SERVER AVG TRANS PER CONNECTION: N=448, MEAN=5591.52, VAR=4151887.43, MIN=1.00, MAX=7404.00 [/home/jblatt/src/duderino/everscale/base/source/ESBSharedAveragingCounter.cpp:33]
[1595691738:14892:WRN] SERVER CONNECTION ACCEPTS: 504 [/home/jblatt/src/duderino/everscale/http/http1/source/ESHttpServerSimpleCounters.cpp:40]
[1595691738:14892:WRN] SERVER AVG TRANS PER CONNECTION: N=504, MEAN=4970.24, VAR=112291.76, MIN=4585.00, MAX=5410.00 [/home/jblatt/src/duderino/everscale/base/source/ESBSharedAveragingCounter.cpp:33]
b6133cdecbb347431451c293fca62704f09c8bd0:
[1613339818:13448:WRN] SERVER CONNECTION ACCEPTS: 580 [/home/jblatt/src/duderino/everscale/http/http1/source/ESHttpServerSimpleCounters.cpp:40]
[1613339818:13448:WRN] SERVER AVG TRANS PER CONNECTION: N=580, MEAN=4310.34, VAR=23385506.24, MIN=9.00, MAX=25907.00 [/home/jblatt/src/duderino/everscale/base/source/ESBSharedAveragingCounter.cpp:33]
[1613339818:13448:WRN] SERVER CONNECTION ACCEPTS: 770 [/home/jblatt/src/duderino/everscale/http/http1/source/ESHttpServerSimpleCounters.cpp:40]
[1613339818:13448:WRN] SERVER AVG TRANS PER CONNECTION: N=770, MEAN=3246.75, VAR=127045.33, MIN=2.00, MAX=3498.00 [/home/jblatt/src/duderino/everscale/base/source/ESBSharedAveragingCounter.cpp:33]
- Connection reuse false test seems to be borked. Does loadgen keep reusing the connection after close? Clue:
[1613340628:15905:WRN] [prox-server:127.0.0.1:38719>127.0.0.1:51460,1535] cannot send server response: Cleanup [/home/jblatt/src/duderino/everscale/http/proxy/source/ESHttpRoutingProxyHandler.cpp:167]
- EpollMultiplexer dtor calls cleanup handlers on all active connections. This returns them to the client/server socket factories instead of destroying them.
- create WAN test using a special docker image with: tc qdisc add dev lo root netem delay 100ms 20ms distribution normal
- interop test with nodejs: https://github.com/apache/trafficserver/pull/131/files#diff-7044deb3d91355abdd40d4f21f39a354ef0de8eb18d1f9d49e147cea927729b0
- proxy handler should sanitize/drop/merge inbound request headers and outbound response headers
- add test with really small socket buffers
- Add test case to connection pool test - server closes all connections while client connections are in pool.
- ipv6 support
config:
- add JSON parser to base.
- Create JsonElement subclass of EmbeddedMapElement. Subclasses must implement type(), find(), and possibly visit() pure virtuals.
- Create JsonMap (wrap EmbeddedMap), JsonArray (wrap EmbeddedList), JsonString, JsonInteger, JsonFloat, JsonBoolean, and JsonNull subclasses of JsonElement - all of these must return NOT_IMPLEMENTED and assert if key() is called on them.
- Subclass JsonElement for Pairs: JsonStringPair, JsonIntegerPair, JsonFloatPair. These take any JsonElement value and implement key().
- JsonObject only allows adding Pairs. JsonArray only allows adding non-Pairs.
- Create chained find() functions that always return a object you can call find() on - return a JsonNotFound object so you can call repeatedly in the chain without barfing on a non-terminal miss. Then just check it once at the end.
- Parser uses https://github.com/zserge/jsmn with DiscardAllocator for all memory allocation and EmbeddedList of JsonElements as a LIFO stack. Pass in const char *s.
- add JsonConfigFile to base
- Use fread, etc to feed bytes to the parser
- Protect root node with ReadWriteLock - getting the root node should return a lock that can be passed to a scope lock: ReadScopeLock lock = config.root(&root);
- Better: subclass ReadScopeLock to store both lock and root internally - construct with a pair struct copied by value from config.root(). This object should only return const reference to the root node.
- Integrate with SignalHandler to re-read config file on SIGHUP
- Update config with RCU - get write lock, swap root nodes, reset allocator used for old root node - probably just maintain 2 discard allocators and somehow track old vs. new
- yaml to JSON conversion
- enforce config file versioning
- discard allocator per parse action.
- discard allocator per parsed entity. take block_size hint/override from config.
- RCU update per entity id. entity id is the unit of config update.
- Update and use global settings
- Config language additions:
- Support local setting of cluster loads, potentially by tying them to limits. Can then be adjusted based on arbitrary outbound response headers (e.g,. my current load is 20), etc.
- Max requests per connection. Requires an attribute count
- Retry requests. Requires an attribute count + outbound request state + backtrack into inbound request DAG.
- Canarying and incremental rollouts. Consistent hashing vs. naive - use to pick a LOAD_BALANCER
- Building blocks: conditions, actions, rules, transitions, expressions
- Add marks and annotations to connections, requests, and responses
- Define an expression language to do things like rewrite URLs with a subset from the original request (redirect
use case), select values >, >=, <, <=, ==, != (WAF anomaly scores and remaining retries), arithmetically adjusting
- For value of 301 Location header
- For metrics
- For rewrites
- For annotations (include subset of request in an annotation's value)
- Conditions: PROPERTY (READ), ACQUIRE_SLOT, GEO_COUNTRY,
- Actions: CREATE, UPDATE (Mark, Attributes, Headers, URI), DELETE, LOG, MEASURE, transition,
- CRUD actions
- ports with TLS config
- filter DAGs
- Filters: RULE_LIST, EXTENSION, REQUEST, CLOSE, CLEANUP, RESPONSE, LOAD_BALANCER, SEND ...
- Indices: PATH_WILDCARD_MAP, FQDN_WILDCARD_MAP, CIDR_MAP, STRING_MAP, ...
- Geo: COUNTRY_CODE_MAP
- Clusters
- TLS indexes
- Limits: max, rate, top n
- Seco range https://github.com/eam/libcrange.
- CIDRv4 and CIDRv6
- WAF + security rules + SANITIZE
- EXTENSION: extern C extension API, pass through config
- Validation:
- Ports: ip+port pairs should always be unique. If INADDR_ANY is used with a port, only INADDR_ANY can be used.
- All referenced ids should exist and be the right type
- Max string lengths and max integer limits
- type, entity, transitions, etc string names map to valid enums.
- id values are valid UUIDs (so they can be converted to more compact binary representations).
- id values are unique or at least unique within type.
- files like certs, keys, and shared libs actually exist
- All filter DAGs must really be DAGs - no cycles... with the exception of retrying requests
- All filter DAGs must terminate with a CLEANUP filter
- All elements must have mandatory fields present (e.g., TLS contexts always need a default context)
- Marking entities before they could exist (e.g., marking an inbound request in an inbound connection)
- Transitioning into another DAG with the exception of retrying OUTBOUND_REQUEST
- Max 127 distinct marks per entity type (e.g., INBOUND_CONNECTION, OUTBOUND_REQUEST, etc)
- config file version is not supported
- An extension's API version is not supported
- An extension cannot be loaded for any reason (extensions are dlopen()ed during the validation phase)
- LEAST_LOADED iteration is used on a cluster with non-ENDPOINT_LOADS
- WARNING: Clusters: CIDR and Seco ranges and scalars should not intersect
- WARNING: No ids of any type should exist if they are not reachable from an inbound port object
- Support haproxy, ATS, etc config FEs that generate BE JSON (barf on extensions)
- golang, ruby, or python internal DSL for generating BE JSON
- Use https://developers.google.com/blockly to create a visual config file editor and host it behind the proxy. Custom renderer can show a graph view of the filters. Custom generator can spit out config files. Somehow load existing config files to visualize and modify.
- Script that reads config file and spits out a dot file for graphviz.
- Consider exposing connection/request/response state to condition expressions
- Consider moving TLS section from ports to inbound connection
- Consider moving TLS section from clusters to outbound connection
- Consider expressing connection pool in outbound connection DAG
- Consider load shedding by closing listening sockets
- Consider exposing callbacks that can be invoked when a connection/request/response transitions to a set of states
- Consider expressing L7 protocol as a connection filter
- Consider expressing send response as an action, not a transition to a dedicated filter
- Use cases:
- Max requests per connection. Requires an attribute count
- Retry requests. Requires an attribute count + outbound request state + backtrack into inbound request DAG.
- Canarying and incremental rollouts. Consistent hashing vs. naive
- Retries (backtrack into request filter DAG)
- Routing and basic load balancing
- Hierarchical failover (local, zonal, regional)
- Rewrites/transformations: inbound, outbound X request, response
- WAF and sanitizers
- Anti-DoS (preventing clients from using more than their fair share)
- Load shedding (protecting the proxy itself): stop accepting, accept and close, delay request
- End to end backpressure from outbound overload to inbound delay (protecting the origin)
= Filter extensions with fail open and fail closed examples
- Marking to model complex matches
- Marking to relay state across connections, requests, and responses X inbound, outbound
- Marking to modify control flow
- Cluster definition via CIDR and Seco ranges, FQDN and ip:port pair literals
- Firewall scenarios: default deny with CIDR and Geo and more sophisticated exceptions
- TLS indexes. Server side SNI serving and client side selection
- Marking and routing based on TLS properties
- mTLS-based access controls, and anonymous access
- Metrics: increment, decrement, set integer/float values and average/percentile latency
- Error Logging: severity + printf style formats referring to subsets of the request/response/connection etc.
- Tracing: adding trace id headers, appending information to existing trace headers.
- Cookie/token validation and issuing using custom extension
- Randomized exponential backoff for retries
- Rate limit on arbitrary keys (connections, inbound request attributes, etc)
- Rate limit by adding delay
- Remote update protocol. Just call the same internal APIs as the config file based updater.
- Support no-op updates: checksum each id. If the checksum doesn't change just R not RCU? Deal with race condition.
metrics:
- add metrics store to base
- Create Metric subclass of EmbeddedMapElement. Use fixed-length string keys
- Allocate with discard allocator and recycle dead Metrics
- Use SharedEmbeddedMap with 0 locks - create a teeny convenience wrapper for this so it's obvious it's not locked?
- Use timing wheel to track and evict idle metrics
- Use per-silo/per-multiplexer, so no synchronization required
- add metrics aggreation
- Main thread periodically and push Commands to all multiplexers to all metric from multiplexer/silo and update root metrics store.
- Push Commands sequentialy (wait for each to finish before pushing next command) and add locking to root metrics store updates if TSAN complains
tls:
- HTTP server should assert that client hostname matches server certificate (CN or SANs).
- Consider option for HTTP server redirecting request if client hostname matches a more specific server certificate.
- support removal of tls contexts from context index - overshadowed SANs should emerge.
- evaluate against https://tools.ietf.org/html/rfc2818
- revisit HttpClientSocket and HttpServerSocket wantRead and wantWrite: allow the statemachine to finish the TLS handshake even when the stream is paused.
- add secure/insecure bit to connected socket log address
- use password protected private keys
- server-side validation - verify issuer chain iff client cert is presented
- Create integration test that exceeds chain length X client and server
- Create integration test that presents cert for valid CN/valid SAN that is signed by a non-trusted CA
- Create integration test that presents expired TTL
- Create integration test that presents tampered cert
- Create integration test that presents tampered private key
- client side session resumption
- server side session resumption
- can boringssl lib's memory be reconciled with allocator framework?
- enable ASLR
- audit initialization for FIPS = https://boringssl.googlesource.com/boringssl/+/master/crypto/fipsmodule/FIPS.md and https://github.com/envoyproxy/envoy/commit/a734887ad06609cf0b3c023d38239bf3e79d3717
- perfect forward secrecy
- ocsp staples
scheduling:
- hierarchical timing wheels to reduce ESB_OVERFLOW errors when scheduling timeouts beyond the timingwheel window.
- add delay support (call me back later) with cancellation to epoll sockets and handlers
daemon:
- Base class with preFork and postFork functions. Subclass can bind to privileged ports in preFork
- If configured, drop privs after postFork
- Slurp keystore readable only by root into memory before dropping privs. Use for private key password
- start() takes over thread and waits for sigchild... restarting child processes that exit non-zero
- catch SIGTERM and relay SIGTERM to child pid
- maintain counters that track restarts
- crash loop handling - exponential backoff with jitter?
proxy:
- add a test for origin that sends a connection:close without a content-length and without a transfer-encoding. Proxy should treat the rest of the stream as the body.
- for proxy initiated connections, do not blindly relay connection close headers and content-length/transfer encoding headers from the client and server.
- relay error codes to handler.endTransaction(). Instead of returning true|false in handleRead(), etc, return ESB::Error to the multiplexer and pass these back to the handlers.
- move HttpServerHandler, HttpClientHandler, HttpMultiplexer and their deps into http-common. Leave HttpMultiplexerExtended in http1
- do not wait to receive entire request body before sending response body. Instead stream them concurrently
- add protocol coverage/interoperability tests:
- (no request body/wait for socket close, unencoded request body / content length 0, chunked request body) X (client,server) X (request,response)
- pipelining client that doesn't wait for the current transaction to complete before sending the next request
- concurrent request/response streaming
- correct handling when the server sends a response early... when shoudl it close the connection before finishing the response?
- add stress tests:
- proxy: backpressure works for clients that read response body slowly
- proxy: backpressure works for servers that read request body slowly
- server: mangled http client
- client: mangled http server
- Reduce proxy handler copying with a swapping rule (from recv to send): swap iff send is empty and recv has no data from the next transaction in it. Else copy
- release send buffer when transitioning to parse headers. acquire send buffer when transitioning to format headers (and later for bidy body streaming)
- release recv buffer when transitioning to format headers. acquire recv buffer whwn transitioning to parse headers (and later for bidi body streaming)
performance and efficiency:
- pahole --reorganize
- use unlikely/likely macros with -freorder-blocks or -freorder-blocks-and-partition
- measure working set size and L1/L2/L3 cache misses
- do not unconditionally close server sockets after sending an error response - make that configurable and add a test for it
- modify multiplexer - first write, then read.
- Consider compacting recvBuffer unconditionally for every ServerSocket and ClientSocket handleReadable event. This would do more memmoves in exchange for fewer socket recvs.
- Consider using low watermarks for compacting and filling recvBuffer
- Server socket: release send and recv buffers in between transactions unless there's data for the next transaction in the app or socket recv buffs.
- Consider always registering for read and write events and just discard events we don't care about / avoid a bunch of epollctl MOD calls.
- Consider filling the recv buffer with data for the next transaction but not processing it until the current transaction completes.
- compare performance of RELEASE vs RELEASENOPOOL vs. RELEASENOPOOL+tcmalloc on bare metal. If allocators aren't the best option, limit their use to HttpTransaction and use standard new/delete/new[]/delete[] everywhere else.
- try https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=70da268b569d32a9fddeea85dc18043de9d89f89
- TCP tx and rx 0 copy
- RSS and pinning threads to cores
- ktls sockmap and splice
- TFO
- Experiment with larger userspace and TCP buffers
- each multiplexer allocs array of max fd epoll events. instead divide max fds by num multiplexers? woudl ahve to ensure that no more than num max fds will be used in a multiplexer - don't accept new connections unless you have 2 fd free (1 fd for server connection, 1 fd for client connection)
- hierarchical timing wheels to reduce memory usage for timers
- wildcard indicies - support per-silo vs. global option (currently global only)
- TLS_AES_128_GCM_SHA256
- https://en.wikipedia.org/wiki/AES_instruction_set
- https://en.wikipedia.org/wiki/Coffee_Lake
- https://ark.intel.com/content/www/us/en/ark/products/191789/intel-core-i9-9900-processor-16m-cache-up-to-5-00-ghz.html
- https://www.cyberciti.biz/faq/how-to-find-out-aes-ni-advanced-encryption-enabled-on-linux-system/
connection pool rework:
- fix regression caused by switch from rb tree to hash table for connection pool. https://docs.google.com/spreadsheets/d/1UOR-96YUOYD5poKrTPQv_GthBPWZb_fJzd6D4k0aNIk/edit#gid=2084099544
- create option of global or per-silo connection pool and set default to per-silo.
- enforce max connections per destination address (ip + port + transport + encrypted tuple)
- evict old connections
code health:
- Create macros for placement new / add noexcept and allocator references to all uses of placement new
- Share more code between HttpClientSocket and HttpServerSocket
- replace all these memcpy's of ESB::Buffer internals with a convenience function
- Fewer log levels. just error, warning, info, debug
- Naming consistency: pool to factory, create to acquire, destroy to release, handler to callback, getFoo to foo
- use allocator cleanup handlers in embedded list test and more widely
- use references instead of pointers when they shouldn't be NULL
- reduce the number of name() and getName() functions
- rework context name() composition/hierarchy. Perhaps use variadic args in TCPConnectedSocket?
router:
- create async router infra. How can the async response be relayed? callback function?
- server trans routing: resolve to VIP/hostname. Use right after HTTP request headers have been received in ESHttpServerSocket.
- client trans routing: resolve to destination address. Use in ESHttpClientSocketFactory::ExecuteClientTransaction()
- make VIP/hostname first class member of transaction. only routers can set it. a naive router could set it to the host:port from the request URI
- create simple DNS based async router impl. Threadpool + synchronous DNS system resolver + local TTL and LRU-based cache.
- CIDR ranges
- Seco range - https://github.com/eam/libcrange, https://github.com/square/grange
load shedding:
- pause/resume for http listening sockets (overloaded use case). support PAUSE returns from HttpServerHandler::acceptConnection
- load shedding dimensions - inbound connections, outbound connections, memory...
- per-destination/origin load shedding
http stack test
- support many virtual IPs in client server tests so we can exercise more ports. Current max tested is 25k client -> 25k server.
- pin threads / fe-be connection alignment / packet steering
server:
- start as root, bind to ports/call initialize(), drop privs, call start()
- readiness and liveness check support - e2e vs local options
- plugin api
allocators:
- add ctrs for discard allocator to track extra large chunk allocations
- add counters to track stranded memory in discard allocators
- create an option to disable all allocators - just forward to the source allocator unless you are system. Then see if there are memory leaks and measure perf difference. also measure perf difference vs. tcmalloc
- Should placement new automatically set cleanup handler if object has one? Put cleanup handler in ESB::Object?
containers:
- lockless linked list
- lockless hash table which uses lockless linked lists
cmake:
- doxygen
- code coverage
- addr space randomization
- more *san (undefined behavior detector, etc)
async dns client:
- cares
- start with caching sync impl and async interface. cache according to the dns record ttl and LRU
- common caching code. intial async impl can use threadpool and gethostbyname
tcp proxy stuff:
- make sure full duplex
- track bps
- turn into a unit test over loopback/same proc
http stuff
- functions to hex encode/decode header fields
- merge duplicate headers
- add user agent/server agent headers
- remove sensitive headers while proxying
- fix double add of Transfer-Encoding: chunked.... strip headers that control connection lifetime during proxying
- max requests per connection option (1 disables keepalives, 0 is unlimited)
- max header size option
- max body size option
- save don't skip trailer
- slow loris defense - not just idle time, also max transaction latency
- h2 support, move parsing and formatting state from sockets to transactions
- support 100 Continue - server side already supported? client side - do after connection reuse?
- support max requests per connection
- support receiving half closed from client while still sending outstanding response
- make all the unsigned chars in HttpMessage, HttpRequest, and HttpReponse chars? UTF-8 only in encoded form?
WAF
- https://coreruleset.org/
URL
- canonicalization / encode and decode per https://url.spec.whatwg.org/ and https://chromium.googlesource.com/chromium/src/+/HEAD/url/url_canon_path.cc
- fully but lazily parse query string - if requested parse into a list of key value pairs.
logging
- rotating file logger - batch into memory buffers and flush occasionally. double buffer. fill one while flush other. if one to be filled is too small, drop messages and increment drop counter.
- add trace id extension header, copy across all transactions, and include in log messages
metrics:
- calc percentiles in latency - https://www.codeproject.com/Articles/25656/Calculating-Percentiles-in-Memory-bound-Applicatio
- granular metrics for client and server sockets - break down failures by state and type (idle, error type)
- metrics for TLS client and server errors
- connection pool hit rate (averaging counter, add 1 for hit, 0 for miss)
managed:
ManagedProxy:
dag_store:
define dag element interface.
TODO write this logic:
ProxyTransaction: Put in ProxyHandler
returns SUCCESS, AGAIN, ERROR. ESF::Error
inbound_connection_state takes inbound_connection object
inbound_request_state takes inbound_request and inbound_connection objects
outbound_connection_state takes inbound_connection, outbound_connection, inbound_request objects
outbound_request_state takes inbound_connection, outbound_connection, inbound_request objects
...
ProxyTransaction has a UUID value for (inbound,outbound) X (connection, request, response). At init time,
only inbound_connection and inbound_request dags are set?
State objects must request state transitions, ProxyTransaction must enact....
struct with: return code (SUCCESS, AGAIN, TRANSITION, ERROR/BREAK), if transition: dag type + UUID
inbound_response_state takes inbound_request, inbound_response, and inbound_connection objects
inbound_connection_dags transitions to:
inbound_request_dag + UUID
inbound_request_dag transitions to:
outbound_request_dag + UUID
inbound_response_dag (to send response without waiting for origin)
inbound_connection_dag (to close connection)
inbound_response_dag transitions to:
inbound_connection_dag
outbound_connection_dag transitions to:
inbound_request_dag + UUID
outbound_request_dag transitions to:
inbound_response_dag
inbound_connection_dag
outbound_connection_dag
inbound_connection_dag_elements
inbound_response_dag_elements - some of these can be routers that map to clusters
inbound_respond_dag_elements
outbound_connection_dag_elements
outbound_request_dag_elements
outbound_response_dag_elements
clusters:
clear_cluster:
address list
load balancing policy
outbound_connection_dag ref
outbound_request_dag ref
outbound_response_dag ref
tls_cluster subclass of clear_cluster:
tls_client_index ref
composite_cluter:
list of cluster refs
load balancing policy (for load balancing - surface absolute and % loaded calculations)
1+ Ports:
1+ ClearPort:
listening_addresses
listen_backlog
inbound_connection_dag ref
inbound_request_dag ref
inbound_response_dag ref
1+ TLSPort: public ClearPort
tls_server_index ref
- Split TLS_IDX into TLS_SERVER_IDX and TLS_CLIENT_IDX
- Merge TLX_CTX into TLS_SERVER_IDX and TLX_CLIENT_IDX
- ES::ConfigStore with RCU semantics