Tuesday, October 7, 2008

Networking Systems Require Tight Control/Data Plane Integration

A critical element in the design of next-generation networking equipment is integrating the service intelligence provided by silicon with the transport capabilities of optics. An efficient combination of these two functional entities yields the mechanisms needed to leverage the massive transport capacity available in the network for deploying and delivering new classes of service. The carrier, service provider, enterprise, and metro markets demand a new breed of equipment combining speed, capacity, and intelligence to utilize bandwidth in flexible and profitable ways, while incorporating emerging technologies such as MPLS, traffic engineering, VPNs, and differentiated services over Ethernet and SONET networks. Designing efficient, high-performance networking systems requires careful integration of data-path network processors (NPs) with control plane processors, co-processors, fabrics, and application software. Handling a high-speed packet/cell stream requires a multi-chip process flow, and unless all of the piece parts work together efficiently, the performance of the system will be less than optimal. In essence, choosing a 10-Gbps NP does not guarantee 10-Gbps line-card performance unless the designer makes careful system-level integration choices as opposed to component-level integration choices.

Data Plane Requirements: The data plane can be viewed as two logic blocks, one for ingress and one for egress processing. In most architectures, egress processing is the simpler of the two, since the fabric subsystem supplies relatively homogeneous packets and egress processing consists mainly of traffic management functions. Ingress processing is the difficult task, since the incoming traffic can be a non-homogeneous mix of packet types, lengths, and protocols from a variety of sources. All of those packets/cells must then go through parsing, classification, policing, admission control, and editing/modification at wire speed with bounded latency to sustain application requirements, especially performance.
When considering data plane processors, it is important to understand the demands placed by system applications on the ingress and egress logic of the data plane and also the interaction and integration with the control plane. The key attributes to evaluate are:

1. Classification performance: Search key size, search frequency and latency, associated data size, and the ability to do recursive search operations directly determine the data plane processor's ability to implement advanced admission control at line rate in edge route applications. Parsing flexibility is another feature affecting the breadth of protocols supported by a data plane processor and also impacts classification table size and maintenance complexity.

2. Provisioning performance: The key provisioning attributes of a data plane processor are standards compliance, range of policing rates, policing rate granularity, accuracy, and number of policing operations per packet while sustaining line rate. Policing algorithm support includes Diffserv srTCM, trTCM, and ATM GCRA and F-GCRA. Policing rates range from 8 kbps for VoIP to 10 Gbps for 10 Gigabit Ethernet links. Multiple policing operations per packet allow packets to be marked based on individual and aggregate flow provisioning.

3. Forwarding performance: A data plane processor's forwarding engine must be capable of making all packet modifications necessary at line rate. A key consideration is whether there is sufficient data path speed up to accommodate packet expansion caused by label stacking, tunneling and appending route headers needed by traffic managers and switch fabrics. A forwarding engine should be capable of pre-pending route headers, push/pop/swapping label stacks for MPLS and VLAN, and add or remove tunnels at line rate. Included in this effort are the QoS mappings necessary between protocols (IP, MPLS, Diffserv, 802.1Q, etc) contained within the packet and the route header formation.

4. Control plane integration: A tight coupling of the data plane and control plane processors provides several advantages. The results of the data plane processor's classification engine may be used to prioritize and expedite processing in the control processor. The data processor's policing engines may be used to protect the control plane from denial of service attacks. Most importantly, it allows the control processor to serve as an application processor for those packets requiring processing beyond layer 4.
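The policing attributes in item 2 can be made concrete with a small model. The sketch below is a minimal, per-packet rendition of the single-rate three-color marker (srTCM, RFC 2697) named above; the class and parameter names are illustrative, not taken from any vendor's API, and byte-accurate timing is simplified to per-packet bucket updates.

```python
class SrTCM:
    """Minimal single-rate three-color marker (RFC 2697) sketch.

    cir_bps: committed information rate in bits/sec
    cbs, ebs: committed and excess burst sizes in bytes
    """
    def __init__(self, cir_bps, cbs, ebs):
        self.cir = cir_bps / 8.0      # committed rate in bytes/sec
        self.cbs, self.ebs = cbs, ebs
        self.tc, self.te = cbs, ebs   # both token buckets start full
        self.last = 0.0               # arrival time of previous packet

    def color(self, now, length):
        # Refill the committed bucket at CIR; any overflow beyond CBS
        # spills into the excess bucket (color-blind mode).
        tc = self.tc + (now - self.last) * self.cir
        self.last = now
        if tc > self.cbs:
            self.te = min(self.ebs, self.te + (tc - self.cbs))
            tc = self.cbs
        self.tc = tc
        if length <= self.tc:
            self.tc -= length
            return "green"
        if length <= self.te:
            self.te -= length
            return "yellow"
        return "red"
```

With an 8-kbps committed rate and 1500-byte burst sizes, three back-to-back 1000-byte packets are colored green, yellow, then red, which is exactly the marking behavior a data plane policer applies per flow.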

Control Plane Requirements: Control plane processors are available or under development from Broadcom (SiByte), SandCraft, PMC-Sierra (QED), IBM, and others. These processors are enhanced versions of popular embedded processors such as MIPS and PowerPC, with additional I/O and packet processing functionality. The key advantage they provide is in the area of development tools and ease of leveraging an existing code base. When considering a control plane processor, the key attributes to evaluate include:

1. CPU performance and architecture: Data plane processors use customized or application driven instruction sets, whereas control plane processors use standard RISC architectures, so CPU MIPS are a key factor in control plane performance. MIPS, PowerPC, and ARM RISC architectures are suitable for the control plane, so the decision should be based on development environment, familiarity, or other control plane processor features. Control plane processors typically use multiple RISC cores, either loosely or tightly coupled, and tend to have one or two level caches.

2. Memory subsystem performance: The memory bandwidth of a control plane processor must meet the demands of both the RISC processor and communications I/O ports. An efficient implementation will allow the data plane processor to stream data directly into control plane processor shared memory without stalling the RISC cores from protocol processing. Since control traffic is stored and then forwarded, the shared memory bandwidth required will be four to eight times the average control traffic rate. The peak control traffic bandwidth equals the line rate, or 10 Gbps plus route header overhead, for 10 Gigabit Ethernet.

3. Communications I/O: Ideally, the data plane and control plane processors will have glueless interfaces that sustain the peak line rate and the required control-traffic rate with bounded latency. Several co-processor and data plane interfaces have been standardized or are under consideration by the Optical Internetworking Forum (OIF) and Network Processing Forum (NPF). For 10-Gbps data plane rates, some options include POS/Utopia-3, SPI 4, HyperTransport, or 3.125-GHz serializers/deserializers (serdes).

Integrating the Planes
A typical 10-Gbps line card contains network processing and traffic management functions between the optics interface and the switch fabric. At 10 Gbps, the network processing function is divided between a data plane processor and a control plane processor. This assures deterministic line-rate performance independent of traffic mix and advanced routing requirements. Since the control plane and data plane share the packet processing task, the two processors should mate such that data traffic (packets or cells) flows bi-directionally over a glueless interface (Figure 1). SPI 4 is an attractive option here, since it meets 10-Gbps performance requirements and is widely available.
Control plane traffic is typically queued in multiple priority queues since it is not processed "on the fly" like data plane traffic. The subport capability of SPI 4 provides the mechanism for transferring multi-priority traffic between the data plane and control plane processors without head of line blocking.
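A minimal model of that multi-priority hand-off is sketched below: one queue per priority level, drained highest-priority-first, so a backlog of low-priority traffic can never block a control packet behind it. This illustrates the queuing concept only, not the SPI 4 protocol itself; the class name is mine.

```python
from collections import deque

class SubportQueues:
    """Per-priority queues, as with SPI 4 subports: traffic at each
    priority waits only behind its own class, so low-priority backlog
    causes no head-of-line blocking for control packets."""
    def __init__(self, levels=4):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, prio, pkt):
        self.queues[prio].append(pkt)  # priority 0 is highest

    def dequeue(self):
        # Serve the highest-priority non-empty queue first.
        for q in self.queues:
            if q:
                return q.popleft()
        return None                    # nothing pending
```

Even if statistics traffic was enqueued first, a later control packet at priority 0 is dequeued ahead of it.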
Tight integration of the data plane and control plane processor architectures will enable several performance efficiencies. The control processor may leverage the classification and provisioning completed by the data plane processor. High priority control packets may be directly transferred to a control processor's RISC cache and the RISC CPU may be dispatched based on classification results.
Per flow statistics gathering is increasing in importance as quality of service is added to IP and MPLS networks through differentiated services. These statistics are not only used for billing, but are a key enabler for network engineering and security against denial of service attacks.
Per flow statistics are now a significant control plane bandwidth issue that is solved by tightly integrating the data plane and control plane processors. Statistics information is collected in the data plane and processed in the control plane.
Tight integration of the data and control planes also simplifies end product field support by enabling sampling and full statistics collection on selected flows. The data plane processor samples traffic by copying packets or portions of packets to the control plane processor for analysis. Random sampling may be applied globally across all traffic to support network engineering. Full sampling of selected flows enable problem determination on troublesome network links.
Linking the Software
Up to this point, we've looked solely at the hardware requirements needed to marry the control and data planes. However, it's equally important to closely link the software environments used to develop code for these processors. Here's why.
To solve performance bottlenecks, network processors implement complex architectures and memory subsystems that consist of multi-processing, multi-threading, crossbar interconnects, hardware assist and micro-engines, and more. This hardware complexity, however, makes implementation a difficult task.
The architecture complexity problems were compounded by weak toolsets. In the early NP designs, each vendor had to develop its own set of software development tools consisting of assemblers, compilers, debuggers, and simulators. However, especially on the compiler front, most of these tools were basic, so designers still had to write a great deal of hand-tuned assembly code to make these processors work.
Hand tuning is a tedious and iterative process, since performance feedback comes only after software/hardware integration in the lab and requires intimate knowledge of the hardware micro-architecture. Thus, software development teams are forced to spend the majority of their time trying to fit their application to the NPU architecture, rather than focusing on the characteristics of the application itself.
What is required is a set of powerful software tools that allow the developer to define the data plane behavior of the system utilizing an abstracted graphical programming interface tightly and seamlessly integrated with a C development environment for the control plane.
By utilizing an abstracted graphical programming interface with integrated low-level code generation, the developer can focus on developing the rules required for the application, while the tools take care of generating the optimized executable code, including the C functions required for control plane integration. These tools should also include real-time performance analysis so the developer knows immediately how a particular function will perform, removing the traditionally long performance-tuning cycle from the development process.
Wrap Up
Designing and implementing systems that can deliver advanced IP/MPLS services over high-bandwidth networks requires a great deal of effort to be spent on integration issues. These issues are related to the control plane, data plane, switch fabric, and the efficient mapping of applications between the control plane and the NP-based data plane, such that high performance levels can be sustained under varying traffic loads and patterns.
In choosing an NP, designers should weigh the feature set of the device equally with its impact on the overall system. This will ensure that they derive the full performance benefits from the new generation of easy-to-use, application-driven network processors.

Friday, August 1, 2008

Possible research projects on Octeon.

1.
As the Internet grows more popular, security becomes a greater concern for Internet connections. VPNs have become a very popular way to secure those connections: over a VPN connection, both endpoints exchange data through a secured tunnel that preserves data integrity and confidentiality. However, establishing a VPN connection consumes considerable CPU time and hardware resources, since part of the CPU and memory is occupied by the encryption and decryption process. A dedicated processor that executes encryption/decryption therefore frees up substantial CPU time and other hardware resources. Cavium Networks offers a series of MIPS-based processors, called Octeon, that provide coprocessors to handle encryption/decryption for faster execution. The open-source VPN software Openswan can be modified on Octeon to replace its software encryption/decryption functions with the hardware accelerator. The accuracy and performance of the encryption and decryption processes can then be validated by comparing the hardware and software solutions.
2.
Most data today are stored in computers, and as data volumes grow, so does the frequency with which computers must process them. Reducing stored data volume is therefore an important way to reduce cost. In recent years, SATA disks have been the most cost-effective way to provide storage capacity; however, the Cavium Octeon is an embedded system with no SATA disk installed, so an analysis of the Linux kernel changes needed to attach a SATA disk is presented to provide storage capacity before data compression. There are two kinds of data compression: lossy and lossless. Lossy compression is usually used for image, video, and audio processing, while lossless compression is usually used for text or other environments with low fault tolerance. The Cavium Octeon provides a hardware lossless compression engine, the ZIP coprocessor, which is used in this research. After the implementation, data compressed by the ZIP coprocessor can be decompressed by gzip, and vice versa.

Sunday, July 27, 2008

Network Based Application Recognition

Help Ensure Performance for Mission-Critical Applications: NBAR allows the network to provide differentiated services to each application. You can provide absolute priority and guaranteed bandwidth to your mission-critical applications such as Oracle or an application that runs on a particular Web page. At the same time you can limit the bandwidth consumed by the less essential applications. The end result is that users can access their mission-critical applications with minimal delay, without the need to upgrade costly WAN links or cut off access to commonly used, but not mission-critical, applications.

Reduce WAN Expenses: In many parts of the world, and especially between countries, telecommunications links can still be prohibitively expensive. This leads to a dilemma for the network manager: on the one hand you need to provide access to new client-server and Internet-enabled applications, while on the other hand you need to control WAN service costs. NBAR provides a solution to this problem by enabling you to intelligently utilize WAN bandwidth so that you can provide acceptable service levels with the minimum possible bandwidth.


– Manage Web Response: The Web is now a critical business resource in many enterprises, for both internal and external communications. Employees, partners, and customers must have access to the Web pages they need without such problems as slow downloads or Web-based application failure. NBAR allows you to identify the Web pages and type of Web content that you deem critical.


– Improve VPN Performance: VPNs often reduce networking costs while providing increased flexibility. Unfortunately, the service quality in a VPN is often difficult to guarantee. Running NBAR and VPN concurrently in the same router solves this problem by identifying mission-critical traffic before it is encrypted, allowing the network to apply the appropriate QoS controls. By running both VPN and NBAR concurrently, we help ensure that the packets are processed in the correct order to achieve both maximum security and the appropriate QoS. NBAR can also mark the tunnel packet so that the service provider can provide differentiated service to different applications on the service provider's WAN.

Improve Multiservice Performance: Multiservice networks allow you to combine your data, voice, and video requirements into one unified network. Unfortunately, each of these services requires different network characteristics. NBAR is able to intelligently identify the type of each packet and provide the proper network characteristics.

Thursday, July 24, 2008

Network-Based Entitlement Control or NBEC

From Network World
For all the lovely talk about access control emanating from so-called NAC vendors who must have invoked Merlin to magically transform the unworkable Network Admission Control into Network Access Control, there is still one huge problem with access controls. Most enterprises really have no idea who should have access to what resources. The granularity of access control needed to secure the enterprise is beyond the ken of most IT guys. Let’s face it, knowing what applications, networks, and data sets any one of say 10,000 people should have access to is not a simple problem.
Camelot attempted to address the failings of most identity and access management (IAM) systems by building in a learning component. What happened to Camelot? I wish I knew. For some reason the IT press is great at recording the history of startups as long as they have an active PR program. As soon as vendors start to die the historical record seems to get wiped clean. I would guess that part of the problem was that they were too far ahead of their time. Another issue was they relied on host agents to do the learning and enforcement, a company killer if there ever was one.
Now, in what appears to me to be the second coming, a new vendor is born from the knights of Cisco. Five top networking guys have apparently recognized that the marketing department at Cisco is not really that good at inventing security solutions (admission control) but that there truly is a need for automated tools to discover and enforce access control policies in the enterprise. The company, Rohati, came out of stealth mode in time for the Gartner IT Security Summit last week in DC. They are calling their technology Network-Based Entitlement Control or NBEC. No agents, automated discovery, policy management. I love it. This could work.
I hope the ever-flexible NAC vendors get out of the endpoint health-check business. Then we could have an industry that is all pulling in the same direction: towards better policy management, more granular authorization, and ultimately, better security.

Now Rohati Chooses Octeon.

Rohati Architecture uses OCTEON CN58XX for Multiple Functions of Control, Data, Security and Services to deliver up to 40 Gbps L4-L7 Secure Application Performance

MOUNTAIN VIEW, Calif., July 21, 2008 – Cavium Networks (NASDAQ: CAVM), a leading provider of semiconductor products that enable intelligent processing for networking, communications, storage and wireless applications, today announced that Rohati Systems, a leader in high-performance Network-Based Entitlement Control (NBEC) has utilized multiple Cavium OCTEON Plus MIPS64® CN58XX 4-core to 16-core processors in a highly innovative system architecture as part of its TNS™ Platform to deliver industry-leading performance and features in a cost-effective manner. Cavium Networks' processors are being designed into market-leading networking equipment such as routers, switches, Unified Threat Management appliances, Layer 4+ content-aware switches, modular chassis switches, wireless infrastructure equipment, broadband router and wireless LAN access/aggregation points.
Enterprise Security requirements are rapidly evolving in response to an increasingly dynamic and regulatory governed business climate. Definition and enforcement of these security policies has traditionally been done on a per-application level and through software-only solutions, which carry significant administrative costs and are subject to performance or granularity limitations. The Rohati TNS™ product line delivers for the first time a standards-based, high-performance network-based platform which transparently secures access to data-center resources across all users and applications without requiring client or server side agents thereby dramatically accelerating and simplifying deployment and lowering cost of ownership.
Rohati’s innovative system architecture uses multiple Cavium OCTEON CN58XX 4-core and 16-core processors for different purposes, including control-plane, data-plane, security, and services acceleration. The system consists of OCTEON processors as the only programmable components, connected with a low-latency fabric, in appliance and modular-chassis form factors. These systems deliver a scalable family of networking systems with leading performance, granularity, and security for network-based entitlement control at layers 4 through 7, at up to 40 Gbps with 6 million traffic flows. Rohati’s network-based entitlement control can be transparently deployed in the data center and applied across a broad range of applications and resources, including collaborative applications such as wikis and Microsoft SharePoint, unstructured data stores such as CIFS file shares, and packaged and legacy applications, in companies of all sizes.

Thursday, July 17, 2008

Cavium Networks to Acquire Taiwan-Based Star Semiconductor

MOUNTAIN VIEW, Calif., July 16, 2008 – Cavium Networks (NASDAQ: CAVM), a leading provider of semiconductor products that enable intelligent processing for networking, communications, security and wireless applications, today announced that it has signed a definitive agreement to acquire certain assets and business of Star Semiconductor Corporation. Star Semiconductor is a Taiwan-based design house in Hsinchu with expertise in building highly integrated ARM-based SOC processors for the broadband, connected home and SOHO market segments. This acquisition will provide Cavium Networks with a highly experienced stand-alone SOC processor team based in Taiwan. The net purchase price of the acquisition will be approximately $9 million in cash. The acquisition is expected to close in the third calendar quarter of 2008.
Cavium's existing OCTEON single- and dual-core processor lines address gateway applications in the broadband market including SOHO/SMB, FTTH and enterprise 802.11n access point applications. This acquisition will enable Cavium to deliver highly optimized, cost effective and low power SOC processors to address a significantly broader range of network connected, triple-play enabled devices for the digitally connected home and office. Cavium Networks plans to continue to ship and sell Star's existing product lines.
"Cavium Networks' technology is enabling intelligent networks around the globe," said Syed Ali, CEO and President of Cavium Networks. "Adding Star Semiconductor's highly experienced SOC team focused on broadband and network connected devices will enable us to significantly expand our served end markets. We are very excited about the addition of Star Semiconductor to the Cavium family."
"Star Semiconductor has assembled a proven, highly experienced team of hardware, software and board designers in Taiwan," said Steven Huang, Chairman and CEO of Star Semiconductor. "Being based in Taiwan, we have intimate knowledge of application requirements in the broadband and network connected device markets. Working with local customers, we have developed significant core IP for these markets. Future products will combine Cavium and Star's IP to build highly differentiated, low-power solutions for Cavium's target markets. We look forward to leveraging Cavium's customer relationships and global sales to proliferate the use of the Star technology worldwide."

Sunday, July 13, 2008

Wireshark "TurboCap"

From : http://erratasec.blogspot.com/


I just noticed that CACEtech is now selling a sniffing adapter, "TurboCap", for Windows. (CACEtech is the company that funds Wireshark development; if you are a cybersecurity geek, you should have experience with Wireshark.) This product addresses the problem that operating systems (Windows, Linux, BSD, et al.) are not optimized for sniffing packets. Thus, if you wanted to sniff a fast network with Wireshark, you'd be lucky sniffing at a rate of 300,000 packets per second. This product claims to allow sniffing at 3 million packets per second, which is the max theoretical speed for full-duplex gigabit Ethernet.

The product achieves this with a custom driver. You cannot use this card for normal networking: although it physically is a network card, it will not appear as one of the network cards under Windows. It is a special "sniffing" card instead, available only to custom sniffing applications such as Wireshark.

The first product that replaced the network stack with a custom sniffing driver was the Network General "Sniffer"™ back in the 1980s. This is the product that gave us the name "packet sniffer", and it was the first to achieve "wire-speed" sniffing performance. Many sniffing products have since used this idea. I used to work at Network General. When I founded my own company and created the BlackICE intrusion-detection system in 1998, I likewise used this concept. We wrote a custom sniffing driver for the 3c905 hardware, which happened to be the chip used in Dell notebooks of the time. The upshot was that my Dell notebook could do wire-speed 100-Mbps intrusion detection while other products at the time struggled at 10 Mbps. This was an unbelievable speed back in the day, although custom drivers are more common now, so most intrusion-detection products support wire speed.

When writing a custom driver, or tweaking existing drivers for better speed, there are a number of issues to address.

BUFFER SIZE

Standard network drivers use tiny buffers, often a mere 64k. You want a lot more for a sniffing application. You might allocate a 100-megabyte buffer within the driver for holding packets.

FRAGMENT SIZE

In order to fit variable-sized packets into a tiny buffer, most cards will fragment the packets into 64-byte, 128-byte, or 256-byte chunks. The network driver must then reassemble the fragments back into whole packets before sending them up the network stack. Note that this is a wholly different sort of fragmentation, at the hardware level, unrelated to the fragmentation that occurs at the TCP/IP level. A good choice for fragment size is 2048 bytes: it's large enough to hold standard Ethernet packets without needing reassembly, and only GigE jumbo frames would need to be reassembled.
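The arithmetic behind that 2048-byte choice can be checked directly, assuming a maximum standard Ethernet frame of 1518 bytes and a typical 9000-byte jumbo frame:

```python
import math

SLOT = 2048  # bytes per hardware buffer chunk

def slots_needed(frame_len):
    """How many fixed-size chunks a frame of frame_len bytes occupies."""
    return math.ceil(frame_len / SLOT)

# A standard Ethernet frame (up to 1518 bytes) fits in one slot, so no
# reassembly is ever needed; a 9000-byte jumbo frame spans five slots
# and must be stitched back together by the driver.
```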

POLLING INSTEAD OF INTERRUPTS

The operating-system stack is designed so that incoming packets cause a hardware interrupt. This causes the operating system to halt its current task, run the driver code to deal with the incoming packet, then resume. Handling interrupts is efficient if there are few of them (fewer than 10,000 per second), but extremely inefficient if there are many; sending 3 million interrupts per second at a typical operating system will cause it to lock up. The alternative is "polling", where the software constantly tries to read the next packet. This means there is no interrupt overhead when packets are arriving very fast, but it also means the CPU is pegged at 100% utilization even if there is no traffic at all. A hybrid method is to poll on a timer interrupt: set up a timed interrupt (say, 10,000 per second), then poll the card to see if any packets have arrived since the last interval.
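The hybrid scheme can be sketched as a loop that wakes on a timer and drains everything that arrived since the last tick. In this user-space stand-in, `time.sleep` plays the role of the timer interrupt, and `read_packet` is an assumed non-blocking read that returns `None` when the queue is empty:

```python
import time

def poll_loop(read_packet, handle, interval=0.0001, duration=0.01):
    """Hybrid poll-on-timer model: ~1/interval wakeups per second,
    each wakeup draining every packet queued since the last tick,
    instead of taking one interrupt per packet."""
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        while True:
            pkt = read_packet()   # non-blocking; None when queue is empty
            if pkt is None:
                break
            handle(pkt)
        time.sleep(interval)      # stand-in for the timer interrupt
```

With `interval=0.0001` this models the 10,000-wakeups-per-second figure from the text; a burst of packets costs one wakeup, not one interrupt each.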

DATA TRANSFER

There are two ways of getting packets off the network card into memory. The first is "programmed input/output" (PIO), in which the CPU reads the bytes from the network chip and then writes them into main memory. The second is "direct memory access" (DMA), in which the network card writes the packets directly to memory, bypassing the CPU. The CPU still needs to be involved with DMA: it must tell the adapter where the buffers are in memory, and the driver must continually refresh the list of free buffers. Thus, as the code processes incoming packets, it frees up those buffers and sends them back to the network hardware for reuse in DMA.
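That buffer-recycling loop can be modeled in a few lines, with the NIC side simulated; the class and method names are illustrative only:

```python
from collections import deque

class DmaRing:
    """Toy model of DMA buffer recycling: the driver posts free buffers
    to the NIC, the NIC fills them with received packets, and the driver
    returns each buffer to the free list once the packet is consumed."""
    def __init__(self, nbuf):
        self.free = deque(range(nbuf))   # buffer indices owned by the NIC
        self.filled = deque()            # buffers holding received packets

    def nic_receive(self, payload):
        """Hardware side (simulated): DMA a packet into a posted buffer."""
        if not self.free:
            return False                 # no buffer posted: packet dropped
        idx = self.free.popleft()
        self.filled.append((idx, payload))
        return True

    def driver_process(self):
        """Software side: consume a packet, then repost its buffer."""
        idx, payload = self.filled.popleft()
        self.free.append(idx)            # recycle buffer for further DMA
        return payload
```

If the driver falls behind and the free list empties, the hardware has nowhere to write and packets are lost, which is why refreshing the free list promptly matters.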

KERNEL-MODE TO USER-MODE TRANSITIONS

In much the same way that handling an interrupt is expensive, there is a lot of overhead in transferring control from the driver (which runs in kernel mode) to the application (which runs in user mode). You would likewise lock up the system trying to do this for every packet at 3 million packets per second. The trick to get around this is to map the buffer into both kernel space and user space. In this manner, the user-mode sniffing application can read packets directly from the buffer without a kernel-mode transition.
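As a loose user-space analogy (this is not how a kernel driver actually maps its buffer), two `mmap` views of the same pages show why a shared mapping avoids copies: a write through one view is immediately visible through the other, with no data transfer in between.

```python
import mmap
import os
import tempfile

# Two mappings of the same 4 KB of backing pages, standing in for the
# kernel-side and user-side views of one packet buffer.
fd, path = tempfile.mkstemp()
os.ftruncate(fd, 4096)

kernel_view = mmap.mmap(fd, 4096)   # "driver" fills packets in here
user_view = mmap.mmap(fd, 4096)     # "sniffer" reads them in place

kernel_view[0:5] = b"hello"         # no copy, no mode transition
assert user_view[0:5] == b"hello"

kernel_view.close(); user_view.close()
os.close(fd); os.unlink(path)
```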

CPU BUDGET

Consider a 3-GHz CPU trying to sniff packets on a full-duplex gigabit Ethernet link at 3 million packets per second. Simple math shows that you have only 1000 CPU cycles per packet; that is your "budget". This budget gets used up quickly. The problem isn't necessarily the number of instructions that can be executed (CPUs can execute multiple instructions per cycle), but memory access. The CPU can access a register in 1 cycle, first-level cache in 3 cycles, second-level cache in 25 cycles, and main memory in 250 cycles. In other words, if the software attempts to read memory and it's not in the cache (a "cache miss"), it must stop and wait 250 cycles for the data to arrive. Thus, a 1000-cycle-per-packet budget equates to a 4-cache-miss-per-packet budget.

The packets are DMAed into memory, but not into the cache, so reading the first byte of a packet results in a cache miss. That leaves only 750 cycles. The header information (packet length, timestamp, etc.) is located in a different place in memory; reading it causes another cache miss, leaving 500 cycles. Multiple CPUs must be synchronized with a full memory access, which has the same cost as a cache miss, leaving 250 cycles. If you then do something like a TCP connection-table lookup, you've probably got another cache miss, leaving 0 cycles to actually process the packet. We've quickly exceeded our CPU budget without doing anything.

With most drivers, however, you can co-locate the packet headers with the packet data. By combining them in the same location, they can be read together without a separate cache miss. CPUs also have a pre-fetch instruction, and the packet-read API can be implemented so that whenever the software reads the current packet, it "pre-fetches" the next packet into cache.
Thus, the headers and data will be available next time without a cache miss. If you use a ring buffer and a producer-consumer relationship, you won't need a traditional memory lock to synchronize the driver with the application. If you are even more clever with your sniffing API, you can pre-fetch several future packets and allow the application to peek at the next packet so it can pre-fetch its TCP connection entry. Putting this all together: a naive driver needs at least 4 cache misses per packet, which makes 3 million packets per second impossible, but the tricks above let you process a packet without any cache misses at all.
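The budget arithmetic above, worked through explicitly:

```python
CPU_HZ = 3_000_000_000   # 3-GHz CPU
PPS = 3_000_000          # 3 million packets per second
MISS = 250               # cycles stalled per main-memory access

cycle_budget = CPU_HZ // PPS          # cycles available per packet
miss_budget = cycle_budget // MISS    # cache misses we can afford

# Cache misses per packet for the naive driver described above:
naive_misses = [
    ("packet data not in cache", 1),
    ("headers in a separate location", 1),
    ("multi-CPU synchronization", 1),
    ("TCP connection-table lookup", 1),
]
remaining = cycle_budget - MISS * sum(n for _, n in naive_misses)
# cycle_budget == 1000, miss_budget == 4, remaining == 0:
# the whole budget is gone before any real work is done.
```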

OTHER TRICKS

There is a long list of other optimizations you can do. For example, you'll want to align your buffers on cache-line boundaries. You'll also want to set the processor-affinity flags so that the driver uses one CPU core while the user-mode process uses the remaining cores.
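Both of those optimizations are straightforward on Linux; here is an illustrative sketch. The 64-byte `CACHE_LINE` is an assumption (typical for x86, but worth verifying, e.g. with `getconf LEVEL1_DCACHE_LINESIZE`), and `sched_setaffinity` is Linux-specific.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64  /* typical x86 cache-line size; verify for your CPU */

/* Allocate a packet buffer aligned to a cache-line boundary so a packet
   never straddles an extra line unnecessarily. */
void *alloc_aligned_buffer(size_t size) {
    void *buf = NULL;
    if (posix_memalign(&buf, CACHE_LINE, size) != 0)
        return NULL;
    return buf;
}

/* Pin the calling thread to one core, e.g. the driver on core 0 while the
   user-mode process runs on the remaining cores.  Returns 0 on success. */
int pin_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```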

CONCLUSION

CACEtech claims "wirespeed" performance, which implies 3 million packets per second. I don't know whether they've implemented all these tricks. Their cards are a little pricey ($900 each), so I'm not willing to buy one just to play with it. However, for anybody running network tools like Wireshark or Snort, they should give a huge boost in performance.

DNS Vulnerability


Posted 7/8/08 by Robert from the 'UDP 4 lyfe' department

A pretty nasty DNS vulnerability has been discovered in 81 products by Dan Kaminsky. This vulnerability seems to be of the same type described by Amit Klein and involves abusing the PRNG used in DNS query transactions. Long story short: if you run a vulnerable caching DNS server, your cache can be poisoned. From CERT: "The DNS protocol specification includes a transaction ID field of 16 bits. If the specification is correctly implemented and the transaction ID is randomly selected with a strong random number generator, an attacker will require, on average, 32,768 attempts to successfully predict the ID. Some flawed implementations may use a smaller number of bits for this transaction ID, meaning that fewer attempts will be needed. Furthermore, there are known errors with the randomness of transaction IDs that are generated by a number of implementations. Amit Klein researched several affected implementations in 2007."
The Problem
The root cause is a fundamental, well-known weakness in the DNS protocol. DNS uses UDP, a stateless protocol. A DNS server will send a request in a single UDP packet, then wait for a response to come back. In order to match request and response, a number of parameters are checked: Who sent the response? Was it the DNS server we sent the request to? Do we have an outstanding request for this particular response? Each request uses a unique and random query ID, and the response has to use the same query ID. Finally, the response has to be sent to the same port from which the request was sent. Only if all of this matches is the response accepted, and the first valid response wins. If an attacker is able to guess the query ID and the source port, the attacker can send a fake response, which will be cached by the DNS server.

How likely is it to "guess" the query ID and the source port? One would think it's not that easy. The query ID is 16 bits long, allowing for 65,536 options. The source port could be anything above 1024, which would allow for another 64,512 options. It is easy to guess which DNS server is expected to reply, as it has to be a valid DNS server for the respective domain. A reasonable DNS server should respond in less than a second, allowing about one second to send the spoofed response. At least for BIND, the source port only changes when you restart the server; once restarted, it keeps using the same source port.

Ideally, one would think it would take millions of packets to successfully spoof the response. However, the problem is in the details. A DNS server cannot use just any port to source a query: it may not use a port commonly used by outbound connections, or a port reserved by a server. This is the issue attacked by today's patches. Until now, DNS servers used a rather small set of ports to source requests; this is the actual new finding. The patch increases the pool of source ports available to DNS queries.
To make things worse, the real DNS server may be silenced using DDoS attacks. Over the past few months, we have had a couple of patches (again, both for Microsoft and for BIND) addressing the randomness of the query ID.
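To put numbers on the guessing game: an attacker searching a uniformly random space needs, on average, half of it. A tiny back-of-the-envelope helper (my own illustrative code, not from any DNS implementation) reproduces both CERT's 32,768-attempt figure for a fixed source port and the much larger figure once port randomization is added.

```c
#include <stdint.h>

/* Average number of spoofed responses needed to hit a uniformly random
   (transaction ID, source port) pair: half the size of the search space. */
uint64_t avg_spoof_attempts(uint64_t id_space, uint64_t port_space) {
    return (id_space * port_space) / 2;
}
```

With a 16-bit ID and a fixed port this yields 32,768 attempts; adding the 64,512 usable source ports pushes the average above two billion, which is why per-query source-port randomization is such an effective mitigation.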

How bad is it?
If you run a caching DNS server, patch it soon. I wouldn't say "today, while ignoring sane patch management". But check with your vendor and follow their guidance. The world is not going to end today. It will in fact end in 2 1/2 years from today (different story ;-) ). But this is something you have to fix soon. Right now, the US-CERT advisory lists a handful of vulnerable products and quite a few "unknowns".
Eventually we all may have to break down and fix DNS. DNSSEC is an extension to DNS that adds cryptographic authentication. However, it requires a PKI which at this point doesn't exist. There is not much to be gained from implementing DNSSEC on your own (but by all means: try it out and see how it works).
One thing to test carefully is your firewall. We have already heard about issues with ZoneAlarm and MS08-038. However, it is possible that other firewalls will think something is wrong if your DNS server all of a sudden keeps jumping ports.
Where can I find out more?
CERT: www.kb.cert.org/vuls/id/800113
Internet Software Consortium (BIND): http://www.isc.org/sw/bind/bind-security.php
Dan Kaminsky on Martin McKeay's Podcast: http://media.libsyn.com/media/mckeay/nsp-070808-ep111.mp3
DNSSEC resources:
DNSSEC Overview: http://www.dnssec.org
DNSSEC Deployment Initiative: http://www.dnssec-deployment.org
DNSSEC HowTo: http://www.nlnetlabs.nl/dnssec_howto
-----
UPDATE:
The CERT announcement implies that strong randomization of the source port and transaction ID makes the attack impractical: "Because attacks against these vulnerabilities all rely on an attacker's ability to predictably spoof traffic, the implementation of per-query source port randomization in the server presents a practical mitigation."
isc.org warns that busy DNS resolvers could be impacted by the patch, so they recommend the beta release:
"The patches will have a noticeable impact on the performance of BIND caching resolvers with query rates at or above 10,000 queries per second. The beta releases include optimized code that will reduce the impact in performance to non-significant levels."
Johannes B. Ullrich, Ph.D. SANS Technology Institute - http://www.sans.edu

IDS/IPS Info

Some links to IPS signatures and information:
http://www.emergingthreats.net/rules/emerging-web_sql_injection.rules
http://wiki.intoto.com/intoto_wiki/tiki-index.php?page=IntruPro-IPS

XSS / Cross-Site Scripting:
http://www.cgisecurity.com/articles/xss-faq.shtml

LFI/RFI attacks: Local/Remote File Inclusion attacks.

Wednesday, July 9, 2008

Snap the Alliance !

Adaptec sells Snap Server networked, desktop storage appliances business to Overland Storage
MILPITAS, Calif. (AP) --
Data storage provider Adaptec Inc. said Monday it sold its Snap Server networked and desktop storage appliances business to Overland Storage Inc. for $3.6 million.
Adaptec will retain ownership of all iSCSI-based hardware and software products and assets, which will be rebranded and managed by Adaptec.

"The sale of the Snap Server business allows us to focus on strengthening our leadership position in the Unified Serial RAID controller business, leverage our iSCSI assets and continue to streamline the company's operations," said S. Sundaresh, president and chief executive of Adaptec, in a statement.
Overland Storage will take control of all existing Snap Server networked and desktop storage appliance assets including licenses, patents and existing product inventory, and assume customer support obligations.
About 50 Adaptec workers will receive offers to join Overland Storage effective immediately, and will remain in the same facility in Milpitas, Calif., which Overland is subleasing from Adaptec.

Wednesday, June 25, 2008

Storage Networking

The storage networking market has traditionally been driven by Fibre Channel equipment and product offerings. The market since then has been a fight between the companies championing the cause of Fibre Channel and the more recent proponents of iSCSI/IP storage technologies. Companies like Adaptec (through its acquisition of Platys Communications), Alacritech, QLogic, and LSI Logic came out with their offerings of iSCSI/TOE-offloaded PCI/PCI-X NIC cards. iSCSI as a protocol was pushed strongly by Cisco and IBM. (Julian Satran, you listening??) The product offerings of all these companies fall into the storage initiator segment, the storage target segment, and the switch in between (remember Maranti Networks??). The IP storage market has thus far been very slow in developing and has been growing at a snail's pace. Broadcom too has iSCSI chips. Of late, Cavium, with its latest Octeon general-purpose processor, has been trying to address the IP storage vertical. How will it fare, given that there are already big established storage players? That is a question only its storage clientele can answer.

Octeon Powers Palo Alto Networks - Yahoo News

Cavium Networks OCTEON Powers Palo Alto Networks' PA-4000 Series, Best of Interop Grand Prize Winner

MOUNTAIN VIEW, CA--(MARKET WIRE)--May 20, 2008 -- Cavium Networks (NasdaqGM:CAVM - News), a leading provider of semiconductor products that enable intelligent processing for networking, communications, storage, wireless and security applications, today announced that Palo Alto Networks uses Cavium's OCTEON Multi-core MIPS64 processors to power its entire series of next generation firewall systems. Palo Alto Networks' PA-4000 Series next generation firewall won the highly coveted Interop Grand Prize Award as well as the Best of Show award in the security category at the recent Interop 2008 in Las Vegas. Cavium Networks' processors are being designed into market-leading networking equipment such as routers, switches, Unified Threat Management appliances, Layer 4+ content-aware switches, modular chassis switches, wireless infrastructure equipment, broadband router and wireless LAN access/aggregation points. "Palo Alto Networks is delivering a new class of security equipment which enables unprecedented visibility and policy control of applications running on enterprise networks regardless of port, protocol, evasive tactic or even SSL encryption -- at up to 10Gbps with no performance degradation," said Nir Zuk, founder and CTO of Palo Alto Networks. "We selected Cavium's OCTEON processor family from a number of options due to its leading performance, unmatched hardware acceleration, top-to-bottom scalability and lower power. Furthermore, Cavium's strong market momentum and processor roadmap execution make Cavium an ideal long term silicon partner for us."
"Cavium processors are becoming the CPU of choice for a wide range of applications in networking, security, storage and wireless equipment. Cavium's blue chip customer base is leveraging our highly integrated System on Chip multi core processors and targeted hardware acceleration for packet processing, security and intelligent Layer 4 to Layer 7 processing to produce bench mark setting world class products. We congratulate Palo Alto Networks on winning this prestigious award," said Rajiv Khemani, Vice President Marketing and Sales of Cavium Networks.

Quantum Level Jump ?

Cisco has recently come out with its latest QuantumFlow network processor. What impact will it have on the market? What will Cavium do? Only the ensuing times will tell.
The link below provides the details.

http://www.cisco.com/en/US/prod/collateral/routers/ps9343/solution_overview_c22-448936.html

Palo Alto Networks.

This company has been making headlines over the last few months, as they have a good product offering which made quite an impact at the recently concluded RSA 2008. They seem to have what we know as the next-generation firewall. Below is an extract from a web site.
Palo Alto Networks is marketing what it calls next-generation firewalls to address the problems described in the report. But the research itself looks quite solid. It is based not on surveys of people but on a study of network traffic at 20 large companies and government agencies over the last six months. Using its software, Palo Alto Networks monitored the computer behavior of more than 350,000 users. The company has pledged to update and publish the results every six months.
Many companies try to block access to peer-to-peer file-sharing services, but programs used to access these services were found at 90 percent of the companies studied. The most popular were eMule and BitTorrent, which are used to share music, movies and software.

Unauthorized proxies, or software agents that disguise applications, were found on 80 percent of the corporate networks. These can be used for corporate espionage or pilfering trade secrets.
Google applications like Google Docs and Google Desktop were used in 60 percent of the corporations studied. And, no surprise, Internet video services like YouTube were consuming large portions of network bandwidth at all the companies.
One conclusion, the report notes, is that users are routinely, and fairly easily, circumventing corporate security controls. And that is because traditional firewall technology was not meant to grapple with the diversity of Internet applications of recent years.
“We see every enterprise leaking from the inside out,” said Dave Stevens, chief executive of Palo Alto Networks.
But the answer, it seems, is not a draconian crackdown on all Internet applications, but a more fine-grained monitoring and sorting of what applications can play in corporate networks and under what ground rules. After all, many Internet applications are seen as vital tools of productivity, collaboration and innovation — the stuff of Enterprise 2.0 companies.
Take Google Desktop, Mr. Stevens noted. It is a great productivity tool for users to quickly search by topic for the nuggets of information buried in their computer files and information. But companies, he said, are deeply uneasy about the indexing feature that links desktop searches back to Google’s computer servers, and the prospect of their corporate data being indexed by Google.
“But companies don’t want to block Google Desktop, they want to use it securely,” Mr. Stevens said. In this case, he explained, the solution is to be able to turn off the link back to Google’s servers. And in general, he added, the answer is for corporations to have that sort of granular control over the new wave of Internet applications.

Tuesday, June 24, 2008

Rohati - An extract from Allen Shimel's Blog



The best way for me to describe Rohati is that it is layer 7 ACLs to control access to applications. Where we already have security at the perimeter and at the edge, Rohati is about controlling access at the server/application. By integrating with LDAP, Rohati can assign you an access policy for any application. Based upon that, Rohati gives a very fine-grained level of access control at the application layer. It acts as a proxy to the app server for both regular and encrypted traffic. Because the ACLs are on the Rohati box itself, there really isn't any integration with switches per se, and so no integration worries.
The only problem is that the Rohati box has to be able to handle the traffic flow. Hence the box is a big honker. The cheap one is about 20k list I believe and the industrial size version is 80k. This product is aimed squarely at the data center space and is sold through channels.
Will Rohati succeed? Yes, I think it will. I think they have taken a unique approach to a security issue that will continue to grow in the years to come. Application access is an area that I think is still up-and-coming. In a period when nothing is ever new in security, the Rohati team seems to have found something that has not been done before in a packaged, dedicated way like this. If nothing else, with all of the ex-Cisco folks there, Cisco will eat its young and buy the technology back in.

NAC Market Trend - From a Friend's Post



I am sure many of you who are working or have worked with NAC vendors would love to hear this. After a lot of talk about the NAC market being dead, Infonetics has taken a fresh view of the market and predicts a strong forecast ahead. Ref: Reports of NAC's death have been greatly exaggerated; market up 16% in 1Q08
According to the research report, the NAC market jumped 16% in 1Q08 to $62.7 million, which it says is $10 million more than the previous quarter.
Though the NAC market is still dominated by out-of-band appliances, mainly from Cisco and Juniper, Infonetics predicts a shift towards Ethernet-switch-based NAC appliances and in-line (bump-in-the-wire) products. It predicts that purpose-built products from ConSentry Networks and Nevis Networks will make up 25% of the NAC market. Being a Nevis employee, I am really happy to hear this and hope that it happens!!

FireFox 3 goes live.


Early server issues did little to dampen an enthusiastic response on Tuesday for the release of the latest version of Mozilla's Web browser, Firefox 3.
The browser, released at 1 p.m. EDT on Tuesday, uses less memory and adds one-click bookmarking, better suggestions for sought-after Web sites, and features to help Web surfers avoid malicious software. Rival Opera released its latest browser last week, boasting a similar security feature, as will Microsoft's next browser, Internet Explorer 8, which is still in beta.
By Tuesday afternoon, Mozilla stated that about 14,000 people were downloading the software every minute. The demand caused server problems in the early afternoon, according to the company.
"This will put us well into the tens of millions of downloads in a 24 hour period if we can sustain it," the company said in a statement. "Each download is about 7MB so that’s around 13 Gigabits/s of just download traffic. Not too shabby!"
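Mozilla's back-of-the-envelope figure checks out. A quick helper (my own illustrative code, using 1 MB = 10^6 bytes for a rough estimate) converts the download rate into line rate:

```c
/* Convert a download rate (downloads per minute, MB per download)
   into aggregate bandwidth in gigabits per second. */
double download_gbps(double downloads_per_min, double mb_each) {
    double bytes_per_sec = downloads_per_min / 60.0 * mb_each * 1e6;
    return bytes_per_sec * 8.0 / 1e9;   /* bytes/s -> Gbps */
}
```

14,000 downloads per minute at 7 MB each works out to roughly 13 Gbps, matching the statement.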
External attackers have increasingly focused on the browser as a vector through which to attack unsuspecting users' computers. Among the most popular techniques, attackers compromise legitimate Web servers with code to redirect the Web site's visitors to servers hosting malicious code. Anti-malware builds on the anti-phishing features that all three browser makers incorporated into their software last year.
Mozilla had publicized the release, asking users to sign up to download the product in an attempt to set a worldwide record for the most downloads in 24 hours.

Cavium going great guns !

The company I worked for about a year before joining Nevis is doing great. Cavium is gaining a strong foothold in the security market with its latest Octeon processor family. Octeon, being a general-purpose processor, caters to verticals like storage, security, wireless, and data centres. Cavium's share value has been going up since its IPO in May last year.
Share Price of Cavium : http://finance.yahoo.com/q?s=CAVM

Introduction

Hi, I am Kaushik Datta, a person devoted to understanding the nuances of the computer networking industry. I am passionate about networking and interested in new trends in the market, new products, and new companies. Having started my career with Cisco Systems (HCL Technologies-Cisco Development Centre), I was always interested in networking as a subject of research and the contributions it can bring to the world community. I have subsequently worked with Adaptec, Cavium Networks, and Nevis Networks. This blog is an effort to publish recent trends and happenings in the industry and the shape of things to come. With my limited knowledge, I also try to correlate such events and post my views.