* Mutex
* Threads
A process is a collection of code, memory, data and other resources.
 
A thread is a sequence of code that is executed within the scope of the process.
You can (usually) have multiple threads executing concurrently within the same process.
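
A minimal Python sketch of these ideas: one process spawning several threads that share a counter, with a mutex protecting the shared state (the thread count and loop size are illustrative):
<syntaxhighlight lang="python">
# Several threads running concurrently inside one process, all sharing
# the same memory; a mutex (Lock) guards the shared counter.
import threading

counter = 0
lock = threading.Lock()               # mutex protecting the shared counter

def worker(n):
    global counter
    for _ in range(n):
        with lock:                    # one thread in the critical section at a time
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)                        # 40000: the mutex kept the updates consistent
</syntaxhighlight>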
 
= Questions =
 
;1 Mbps speed? What's wrong?
*Take Packet Captures
 
*Check Duplex settings
 
*Peering
 
*Latency
 
*Congestion
 
*Bandwidth-delay product (BDP)
*QoS, Filtering, Routing, ISP issues
 
*TCP window size and window scale factor (WSF)
 
* Maximum possible transfer rate = TCP window size / RTT,
where RTT is the ping response in milliseconds divided by 1000 (i.e., RTT in seconds).
There are only two things you can do to affect your data throughput on a wide area network:
increase your TCP window size, or
reduce latency.
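
A back-of-the-envelope sketch of that formula (it ignores loss, slow start and protocol overhead; the 65,535-byte window and 100 ms RTT are example values):
<syntaxhighlight lang="python">
# Single-flow TCP throughput estimate: window size divided by RTT.
def max_throughput_mbps(window_bytes, rtt_ms):
    rtt_s = rtt_ms / 1000.0            # ping response in ms -> seconds
    return (window_bytes * 8) / rtt_s / 1e6

print(max_throughput_mbps(65_535, 100))   # ~5.2 Mbps for the default 64 KB window
</syntaxhighlight>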
 
*Packet capture
Check for retransmissions after ~200 ms --> retransmission timeouts (RTOs)
 
*Large send offload (LSO) is a technique for increasing the egress throughput of high-bandwidth network connections by reducing CPU overhead.
It works by passing a multipacket buffer to the network interface card (NIC); the NIC then splits this buffer into separate packets.
When applied to TCP the technique is called TCP segmentation offload (TSO); the protocol-independent form is generic segmentation offload (GSO).
The analogous concept for ingress traffic is large receive offload (LRO).
LSO and LRO are independent, and use of one does not require the use of the other.
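
A small sketch for checking these offload settings on a Linux host by parsing the output of ethtool -k (the interface name "eth0" is an assumption, and ethtool must be installed):
<syntaxhighlight lang="python">
# Report whether segmentation/receive offloads are enabled on an interface.
import subprocess

def offload_settings(iface="eth0"):
    out = subprocess.run(["ethtool", "-k", iface],
                         capture_output=True, text=True, check=True).stdout
    wanted = ("tcp-segmentation-offload",
              "generic-segmentation-offload",
              "large-receive-offload",
              "generic-receive-offload")
    # Lines look like "tcp-segmentation-offload: on"
    return {k.strip(): v.strip()
            for k, _, v in (line.partition(":") for line in out.splitlines())
            if k.strip() in wanted}

print(offload_settings())   # e.g. {'tcp-segmentation-offload': 'on', ...}
</syntaxhighlight>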
 
*Fragmentation
 
*Packet Loss
 
*Network throughput impacted by TCP window size, Latency and Congestion
 
*Window Size
Maximum amount of data a sender can send before receiving an acknowledgement.
Standard (unscaled) TCP window size = 65,535 bytes (~64 KB)
 
*It’s not just about latency; TCP doesn’t like congestion
Adding more traffic produces a negative marginal effect above about 30% utilization
 
*Can the application generate 10 Gbps of traffic? Check OS limits: CPU, memory, network card speed.
 
*Window scaling changes the TCP window to:
64 KB * 2^n (n = window scale factor)
 
A window scale factor of 7 gives a TCP window of 8 MB.
 
Single-flow throughput is limited to:
TCP window size / RTT
 
Without window scaling, TCP is limited to:
64KB / 100ms = 5 Mbps
 
With CloudBridge default window scale, TCP is limited to:
8MB / 100 ms = 650 Mbps
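
A sketch of the window-scaling arithmetic (per RFC 7323 the advertised 16-bit window is shifted left by the scale factor); the 100 ms RTT and scale factor 7 match the figures above:
<syntaxhighlight lang="python">
# Effective window = 64 KB * 2^n, and single-flow throughput = window / RTT.
def scaled_window_bytes(n):
    return 65_535 << n                 # 64 KB shifted left by the scale factor

def throughput_mbps(window_bytes, rtt_ms):
    return window_bytes * 8 / (rtt_ms / 1000.0) / 1e6

rtt_ms = 100
print(throughput_mbps(scaled_window_bytes(0), rtt_ms))   # ~5 Mbps, no scaling
print(throughput_mbps(scaled_window_bytes(7), rtt_ms))   # ~670 Mbps with an 8 MB window
</syntaxhighlight>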
 
 
*Window scale factor (WSF): 64 KB to 8 MB
*SACK to minimize data that is resent
*Fast re-transmits to reduce delay before resend
 
 
;Bandwidth Delay Product
Amount of data that can be in transit (in flight) in the network; this includes data in queues if they contributed to the delay.
The bandwidth-delay product is the product of a data link's capacity (in bits per second) and its round-trip delay time (in seconds):
BDP (bytes) = bandwidth (bit/s) x round_trip_time (s) / 8
The result is the maximum amount of data that can be on the network circuit at any given time, i.e. data that has been transmitted but not yet acknowledged.
A network with a large bandwidth-delay product is commonly known as a long fat network (LFN).
A network is considered an LFN if its bandwidth-delay product is significantly larger than 10^5 bits (12,500 bytes).
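
A worked sketch of the BDP formula (the 1 Gbit/s capacity and 100 ms RTT are example values):
<syntaxhighlight lang="python">
# Bandwidth-delay product: how much data is "in flight" on the path.
def bdp_bytes(bandwidth_bps, rtt_s):
    """BDP (bytes) = bandwidth (bit/s) x round-trip time (s) / 8."""
    return bandwidth_bps * rtt_s / 8

# 1 Gbit/s path with 100 ms RTT -> 12.5 MB in flight; the TCP window
# (and socket buffers) must be at least this large to fill the pipe.
print(bdp_bytes(1_000_000_000, 0.100))   # 12500000.0
</syntaxhighlight>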
 
;BIC TCP (Binary Increase Congestion control)
 
*BIC TCP for faster recovery from packet loss
Allows bandwidth probing to be more aggressive initially when the difference from the current window size to the target window size is large, and become less aggressive as the current window size gets closer to the target window size.
A unique feature of the protocol is that its increase function is logarithmic; it reduces its increase rate as the window size gets closer to the saturation point.
 
*BIC is optimized for high speed networks with high latency: so-called "long fat networks".
*For these networks, BIC has significant advantage over previous congestion control schemes in correcting for severely underutilized bandwidth.
 
*BIC implements a unique congestion window (cwnd) algorithm.
*This algorithm tries to find the maximum cwnd by searching in three parts: binary search increase, additive increase, and slow start.
*When a network failure occurs, BIC uses multiplicative decrease to correct the cwnd.
 
*BIC TCP is implemented and used by default in Linux kernels 2.6.8 and above.
*The default was changed to CUBIC TCP in kernel 2.6.19.
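
A heavily simplified sketch of the binary-increase idea below the target window (the S_MAX/S_MIN/BETA constants are illustrative; this is not the Linux kernel implementation and the max-probing phase above w_max is omitted):
<syntaxhighlight lang="python">
# BIC-style binary increase toward w_max (window size at the last loss).
S_MAX = 32.0   # cap on the per-RTT increase (additive-increase regime)
S_MIN = 0.01   # floor on the per-RTT increase
BETA = 0.8     # multiplicative-decrease factor applied on packet loss

def on_rtt(cwnd, w_max):
    """One loss-free RTT: jump halfway toward w_max, clamped to [S_MIN, S_MAX]."""
    step = (w_max - cwnd) / 2.0
    return cwnd + min(max(step, S_MIN), S_MAX)

def on_loss(cwnd):
    """Packet loss: remember the current window as w_max, then back off."""
    return cwnd * BETA, cwnd

cwnd, w_max = 500.0, 1000.0          # recovering after a loss at 1000 segments
for rtt in range(25):
    cwnd = on_rtt(cwnd, w_max)
    # Growth is capped at S_MAX per RTT while far from w_max, then the
    # increments shrink logarithmically as cwnd approaches w_max.
print(round(cwnd, 2))                # ~999.97: growth nearly stops at the target
</syntaxhighlight>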
 
;CUBIC TCP
 
*CUBIC is an implementation of TCP with an optimized congestion control algorithm for high bandwidth networks with high latency (LFN: long fat networks).
*CUBIC TCP is implemented and used by default in Linux kernels 2.6.19 and above, as well as in Windows 10 and Windows Server.
*It is a less aggressive and more systematic derivative of BIC TCP, in which the window size is a cubic function of time since the last congestion event, with the inflection point set to the window size prior to the event.
*Because it is a cubic function, there are two components to window growth.
*The first is a concave portion where the window size quickly ramps up to the size before the last congestion event.
*Next is the convex growth where CUBIC probes for more bandwidth, slowly at first then very rapidly.
*CUBIC spends a lot of time at a plateau between the concave and convex growth region which allows the network to stabilize before CUBIC begins looking for more bandwidth.
 
*Another major difference between CUBIC and standard TCP flavors is that it does not rely on the cadence of RTTs to increase the window size.
*CUBIC's window size is dependent only on the last congestion event.
*With standard TCP, flows with very short round-trip delay times (RTTs) will receive ACKs faster and therefore have their congestion windows grow faster than other flows with longer RTTs.
*CUBIC allows for more fairness between flows since the window growth is independent of RTT.
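
A sketch of the CUBIC window-growth function from RFC 8312 (constants C = 0.4 and beta = 0.7 follow the RFC; the TCP-friendly region and fast convergence used by real implementations are omitted):
<syntaxhighlight lang="python">
# CUBIC: W(t) = C*(t - K)^3 + W_max, with K the time to climb back to W_max.
C = 0.4        # scaling constant
BETA = 0.7     # window is cut to BETA * W_max on a congestion event

def cubic_window(t, w_max):
    """Congestion window t seconds after the last congestion event."""
    k = ((w_max * (1 - BETA)) / C) ** (1.0 / 3.0)   # time to return to w_max
    return C * (t - k) ** 3 + w_max

w_max = 1000.0   # window size (segments) just before the last loss
for t in (0, 2, 4, 6, 8, 10, 12):
    print(t, round(cubic_window(t, w_max), 1))
# Concave ramp from 0.7*w_max back toward w_max, a plateau around t = K
# (~9.1 s here), then convex growth as CUBIC probes for more bandwidth.
</syntaxhighlight>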
 
== SSH to remote server ==
If not, is there any network disconnection/latency/congestion?
 
Are SSH packets reaching the destination? Run tcpdump on the destination (look for the initial TCP packets with the SYN flag). If not, check:
- Is the correct IP being used on the client side?
- Is DNS correct? Try a hosts-file entry or direct IP access.
- Are reply packets going out? If they are, check the router's reverse routes.
- Are iptables rules rejecting packets? (Flush with iptables -F to test.)
- Are packets received on the correct interface? (SSH might be listening on the wrong port/interface/address - check ListenAddress.)
- SSH to the server from localhost to test sshd locally.
 
Is any traffic reaching the server? Check ifconfig RX/TX byte counts - are they increasing? If not, check:
- Networking Config
- Interface check: is the correct IP on the correct interface?
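
A quick client-side reachability check for the SSH port (the hostname, port and timeout are placeholders; adjust for your environment):
<syntaxhighlight lang="python">
# Returns True if a TCP connection to host:port completes (SYN / SYN-ACK),
# i.e. the path and the listening daemon are both reachable.
import socket

def ssh_port_open(host="server.example.com", port=22, timeout=3.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as exc:
        print(f"connect failed: {exc}")   # refused, timed out, unreachable...
        return False

print(ssh_port_open())
</syntaxhighlight>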
 
*Trade-offs you made in this?
Used VMs only.
Could have used physical devices - Cisco Call Manager, physical CBs, Cisco routers, IPNetSim, IP phones.
Used devices across 2 DCs in India & US.
 
*3-tier network design in public cloud. How do you design?
A 3-tier application architecture is a modular client-server architecture that consists of a presentation tier (front-end application), an application tier (back-end application) and a data tier (database).
The data tier stores information, the application tier handles logic and the presentation tier is a graphical user interface (GUI) that communicates with the other two tiers.
The three tiers are logical, not physical, and may or may not run on the same physical server.
 
'''You need a load balancer'''
Presentation tier
This tier, which is built with HTML5, cascading style sheets (CSS) and JavaScript, is deployed to a computing device through a web browser or a web-based application.
The presentation tier communicates with the other tiers through application program interface (API) calls.
 
Application tier
The application tier, which may also be referred to as the logic tier, is written in a programming language such as Java and contains the business logic that supports the application’s core functions.
The underlying application tier can either be hosted on distributed servers in the cloud or on a dedicated in-house server, depending on how much processing power the application requires.
 
Data tier
The data tier consists of a database and a program for managing read and write access to a database.
This tier may also be referred to as the storage tier and can be hosted on-premises or in the cloud.
Popular database systems for managing read/write access include MySQL, PostgreSQL, Microsoft SQL Server and MongoDB.
 
*Benefits:
 
Modularity
Teams can focus on different tiers of the application, and changes can be made as quickly as possible.
Helps us recover quickly from an unexpected disaster by focusing solely on the faulty part.
 
Scalability:
Each tier of the architecture can scale horizontally to support the traffic and request demand coming to it.
Can be done by adding more EC2 instances to each tier and load balancing across them.
 
High Availability:
With the traditional data centre, our application is sitting in one geographical location.
With AWS, we can design our application in different locations known as the availability zones.
 
Fault Tolerant:
We want our infrastructure to comfortably adapt to any unexpected change both to traffic and fault.
This is usually done by adding a redundant system that will account for such a hike in traffic when it does occur.
So instead of having two EC2 instances working at 50% each (where, if one instance goes bad, the other has to run at 100% capacity until the Auto Scaling Group brings up a new instance), we add an extra instance, making three instances that each run at approximately 35%.
This is usually a tradeoff made against the cost of setting up a redundant system.
 
Security:
Applications will communicate among themselves using private IPs.
The presentation (frontend) tier of the infrastructure will be in a private subnet (the subnet with no public IP assigned to its instances) within the VPC.
Users can only reach the frontend through the Application Load Balancer.
Backend and Database tier will also be in the private subnet because we do not want to expose them over the internet.
We will set up the Bastion host for remote SSH and a NAT gateway for our private subnets to access the internet.
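
A hedged boto3 sketch of that network skeleton: one VPC, a public subnet for the load balancer/NAT gateway/bastion and a private subnet for the application tier (the region, availability zone and CIDR blocks are assumptions, and route tables/security groups are omitted):
<syntaxhighlight lang="python">
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# VPC with a public and a private subnet in a single AZ (illustrative CIDRs).
vpc_id = ec2.create_vpc(CidrBlock="10.0.0.0/16")["Vpc"]["VpcId"]
public_id = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.0.0/24",
                              AvailabilityZone="us-east-1a")["Subnet"]["SubnetId"]
private_id = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.1.0/24",
                               AvailabilityZone="us-east-1a")["Subnet"]["SubnetId"]

# Internet gateway so the public subnet (ALB, bastion, NAT) can reach the internet.
igw_id = ec2.create_internet_gateway()["InternetGateway"]["InternetGatewayId"]
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

# NAT gateway in the public subnet gives the private subnets outbound access.
eip = ec2.allocate_address(Domain="vpc")
ec2.create_nat_gateway(SubnetId=public_id, AllocationId=eip["AllocationId"])
</syntaxhighlight>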
 
*Will you need LB? Why?
Room for improvement?
 
== Dublin ==
 
 
== Interview 1 (45 mins) ==
 
;Technical RRK - Foundational technical knowledge where the interviewer will ask you at least one question from 2-3 areas based on your expertise. This would include areas such as (but not limited to):
* Connectivity (Layer-2 switching)
* Connectivity (Layer-3 IP routing)
* Network hardware devices
* Network hardware performance
* Webtech/troubleshooting/SQL
 
;Troubleshooting - Focus on process of troubleshooting rather than technical details
* Networking troubleshooting questions to demonstrate you can apply knowledge of Networking specific technologies to troubleshoot specific problems
;System Design -
 
* Designing technical solutions
 
== Interview 2 (45 mins) ==
;HrM Interview - This interview is based around the below criteria. This is some but not all they will cover.
* Handling customer issues
* Long term customer success
* Escalation handling
 
 
;A few additional tips which may help in your preparations:
 
* Talk through your thought process about the questions you are asked. In all of Google's interviews, our engineers are evaluating not only your technical abilities but also how you approach problems and how you try to solve them.
 
* Ask clarifying questions if you do not understand the problem or need more information. Many of the questions asked in Google interviews are deliberately underspecified because our engineers are looking to see how you engage the problem. In particular, they are looking to see which areas leap to your mind as the most important piece of the technological puzzle you've been presented.
 
* Think about ways to improve the solution you'll present. In many cases, the first answer that springs to mind isn't the most elegant solution and may need some refining. It's definitely worthwhile to talk about your initial thoughts to a question, but jumping immediately into presenting a brute force solution will be received less well than taking time to compose a more efficient solution.
 
* When asked a question by the interviewer '''your answer should''' be:
 
SPECIFIC: in line with the question, no digressions.
SYNTHETIC: be factual; use simple, short sentences and no long explanations. Inform the interviewer that you have additional info if they would like to know more.
TECHNICAL: your interviewer is an expert and they love to hear deep technical details in your answers (relevant to the question, of course). They are eager to learn something new.
OPEN: if you don't know the answer to a question, that is not a problem, but please tell the interviewer; don't pretend that you know by just guessing the answer. Tell them that you will attempt a solution by using the following X Y Z elements. ALWAYS explain the reasoning that you are following.
FLUID: keep the conversation going, with no long blank moments; if you need additional information about the problem you are asked to solve, ask the interviewer. The interview is a conversation, not a monologue or an exam.