Basic Troubleshooting
Proxy Server
Capture Traffic passing through a Transparent Proxy
The following filters, set on a firewall (ScreenOS here), capture the complete packet flow across the firewall.
 set ff src-ip 10.1.1.1 dst-ip 144.32.56.43
 set ff src-ip 192.168.1.1 dst-ip 144.32.56.43
 set ff src-ip 144.32.56.43 dst-ip 65.124.55.31
 set ff src-ip 144.32.56.43 dst-ip 10.1.1.1
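One plausible reading of these filters (an assumption, since the addressing plan isn't stated): 10.1.1.1 and 192.168.1.1 are internal clients, 144.32.56.43 is the proxy, and 65.124.55.31 is the origin server, so the four filters cover the client-to-proxy leg, the proxy's fetch from the origin, and the return leg to the client. On ScreenOS, flow filters like these are typically paired with debug flow basic, with the buffered output read back via get dbuf stream and debugging stopped with undebug all.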
Proxy Server Flow[1]
[Image: Proxy_server_flow_non_transparent.png]
Packet flow for HTTP Traffic
[Image: Proxy_server_flow_non_transparent_http.png]
Packet flow for HTTPS Traffic
[Image: Proxy_server_flow_non_transparent_https.png]
Tail Latency
Sources: highscalability.com, accelazh.github.io
- Imagine a client making a request of a single web server.
- Ninety-nine times out of a hundred that request will be returned within an acceptable period of time.
- But one time out of a hundred it may not; say the disk is slow for some reason.
- If you look at the distribution of latencies, most of them are small, but there's one out on the tail end that's large.
- That's not so bad really.
- All it means is one customer gets a slightly slower response every once in a while.
- Let's change the example: now, instead of one server, you have 100 servers, and a request requires a response from all 100 of them.
- That changes everything about your system's responsiveness.
- Suddenly the majority of queries are slow: since 0.99^100 ≈ 0.37, about 63% of requests will take longer than one second (a quick check is sketched after this list). That's bad.
- Using the same components and scaling them results in a really unexpected outcome.
- This is a fundamental property of scaling systems: you need to worry not just about typical latency but about tail latency, that is, the slowest events in your system.
- High performance equals high tolerances.
- At scale you can’t ignore tail latency.
- This latency could come from:
RPC library
DNS lookups
Slow disk
Packet loss
Microbursts
Deep queues
High task response latency
Locking
Garbage collection
OS stack issues
Router/switch overhead
Transiting multiple hops
Slow processing code
Other reasons:
Overprovisioned VMs
Many OS images forked from a small shared base
A large request pegging your CPU/network/disk, making other requests queue up
A runaway loop ("dead loop") pinning your CPU
- The latency distribution has low, middle, and tail parts.
- To reduce the low and middle parts: provision more resources, cut tasks up and parallelize them, eliminate "head-of-line" blocking, and add caching.
- To reduce the tail: the basic idea is hedging, i.e. sending a backup copy of a slow request to another replica and taking whichever answers first (a sketch follows this list).
- Even if we've parallelized the service, the slowest instance determines when our request is done.
- Code freezes (interrupts, context switches, cache buffers flushing to disk, garbage collection, database reindexing) are a common source of such stragglers.
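The 63% figure above can be checked with a few lines of arithmetic. A minimal sketch, assuming (as in the example) that each server is independently slow 1 time in 100:

 # Probability that a fan-out request hits at least one slow server,
 # assuming each server is independently slow 1% of the time.
 p_slow = 0.01
 
 for fanout in (1, 10, 100):
     p_any_slow = 1 - (1 - p_slow) ** fanout
     print(f"fan-out {fanout:>3}: P(at least one slow reply) = {p_any_slow:.1%}")
 
 # fan-out   1: 1.0%
 # fan-out  10: 9.6%
 # fan-out 100: 63.4%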
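To make the hedging idea concrete, here is a minimal sketch in Python/asyncio. The 50 ms hedge delay, the simulated replica latencies, and the function names are illustrative assumptions, not a recommended policy:

 import asyncio
 import random
 import time
 
 HEDGE_DELAY = 0.05  # assumed budget: hedge if no answer within 50 ms
 
 async def call_replica(replica_id: int) -> str:
     """Simulate one replica: fast 99% of the time, stuck in the tail 1%."""
     latency = 0.01 if random.random() < 0.99 else 1.0
     await asyncio.sleep(latency)
     return f"reply from replica {replica_id}"
 
 async def hedged_request() -> str:
     # Fire the primary request and give it the full hedge delay to answer.
     primary = asyncio.create_task(call_replica(1))
     done, _ = await asyncio.wait({primary}, timeout=HEDGE_DELAY)
     if done:
         return primary.result()
     # Primary blew its budget: send a backup and take whichever finishes first.
     backup = asyncio.create_task(call_replica(2))
     done, pending = await asyncio.wait({primary, backup}, return_when=asyncio.FIRST_COMPLETED)
     for task in pending:
         task.cancel()  # drop the loser so no work is wasted
     return done.pop().result()
 
 async def main() -> None:
     start = time.perf_counter()
     print(await hedged_request(), f"({time.perf_counter() - start:.3f}s)")
 
 asyncio.run(main())

The design point: the backup request is only issued once the primary has used up its latency budget, so the extra load is confined to the small fraction of requests that actually land in the tail.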
References
1. ↑ www.india.fidelity.com