Chapter 6: The Five Nines Concept

Organizations that want to maximize the availability of their systems and data may take extraordinary measures to minimize or eliminate data loss. The goal is to minimize the downtime of mission critical processes. If employees cannot perform their regular duties, the organization is in jeopardy of losing revenue.

Organizations measure availability by percentage of uptime. This chapter begins by explaining the concept of five nines. Many industries must maintain the highest availability standards because downtime might literally mean a difference between life and death.

This chapter discusses various approaches that organizations can take to help meet their availability goals. Redundancy provides backup and includes extra components for computers or network systems to ensure the systems remain available. Redundant components can include hardware such as disk drives, servers, switches, and routers or software such as operating systems, applications, and databases. The chapter also discusses resiliency, the ability of a server, network, or data center to recover quickly and continue operation.

Organizations must be prepared to respond to an incident by establishing procedures that they follow after an event occurs. The chapter concludes with a discussion of disaster recovery and business continuity planning, both of which are critical to maintaining the availability of an organization’s resources.

What Does Five Nines Mean?

Five nines means that systems and services are available 99.999% of the time. It also means that both planned and unplanned downtime totals less than 5.26 minutes per year. The chart in the figure provides a comparison of the downtime for various availability percentages.
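The downtime figures in the chart follow directly from the availability percentage. The following Python sketch (an illustration, not part of the course figure) computes the maximum annual downtime for several availability levels:

  # Maximum annual downtime implied by an availability percentage.
  MINUTES_PER_YEAR = 365.25 * 24 * 60

  for availability in (99.0, 99.9, 99.99, 99.999):
      downtime = (1 - availability / 100) * MINUTES_PER_YEAR
      print(f"{availability}% uptime allows {downtime:.2f} minutes of downtime per year")

  # 99.999% (five nines) works out to roughly 5.26 minutes per year.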

High availability refers to a system or component that is continuously operational for a given length of time. To help ensure high availability:

  • Eliminate single points of failure
  • Design for reliability
  • Detect failures as they occur

Sustaining high availability at the five nines standard can increase costs and consume many resources. The increased costs are due to the purchase of additional hardware such as servers and components. As an organization adds components, the result is an increase in configuration complexity. Unfortunately, increased configuration complexity increases risk: the more moving parts involved, the higher the likelihood of failed components.


Environments that Require Five Nines

Although sustaining high availability may be too costly for some industries, several environments require five nines.

  • The finance industry needs to maintain high availability for continuous trading, compliance, and customer trust. Click here to read about the four-hour outage on the New York Stock Exchange in 2015.
  • Healthcare facilities require high availability to provide around-the-clock care for patients. Click here to read about the average costs incurred for data center downtime in the healthcare industry.
  • The public safety industry includes agencies that provide security and services to a community, state, or nation. Click here to read about a network outage at the U.S. Pentagon Police Agency.
  • The retail industry depends on efficient supply chains and the delivery of products to customers. Disruption can be devastating, especially during peak demand times such as holidays.
  • The public expects the news media industry to communicate information on events as they happen. The news cycle now runs around the clock, 24/7.

Threats to Availability

The following threats pose a high risk to data and information availability:

  • An unauthorized user successfully penetrates and compromises an organization’s primary database
  • A successful DoS attack significantly affects operations
  • An organization suffers a significant loss of confidential data
  • A mission-critical application goes down
  • A compromise of the Admin or root user occurs
  • The detection of a cross-site scripting attack or an illegal file server share
  • The defacement of an organization’s website impacts public relations
  • A severe storm such as a hurricane or tornado
  • A catastrophic event such as a terrorist attack, building bombing, or building fire
  • Long-term utility or service provider outage
  • Water damage as the result of flooding or sprinkler failure

Categorizing the impact level for each threat helps an organization realize the dollar impact of a threat. Click the threat categories in the figure to see an example of each.


Designing a High Availability System

High availability incorporates three major principles to achieve the goal of uninterrupted access to data and services:

  1. Elimination or reduction of single points of failure
  2. System Resiliency
  3. Fault Tolerance

It is important to understand the ways to address a single point of failure. A single point of failure can be a central router or switch, a network service, or even a highly skilled member of the IT staff. The key is that the loss of the system, process, or person can have a very disruptive impact on the entire organization, so the organization needs processes, resources, and components that reduce single points of failure. High availability clusters are one way to provide redundancy. These clusters consist of a group of computers that have access to the same shared storage and have identical network configurations. All servers take part in processing a service simultaneously, so from the outside the server group looks like one device. If a server within the cluster fails, the other servers continue to process the same service as the failed device.

System resiliency refers to the capability to maintain the availability of data and operational processing despite attacks or other disruptive events. Generally, this requires redundant systems, in terms of both power and processing, so that should one system fail, the other can take over operations without any break in service. System resiliency is more than hardening devices; it requires that both data and services remain available even while under attack.


Fault tolerance enables a system to continue to operate if one or more components fail. Data mirroring is one example of fault tolerance. Should a "fault" occur, causing disruption in a device such as a disk controller, the mirrored system provides the requested data with no apparent interruption in service to the user.

Asset Identification

An organization needs to know what hardware and software are present as a prerequisite to knowing what the configuration parameters need to be. Asset management includes a complete inventory of hardware and software.

This means that the organization needs to know all of the components that can be subject to security risks, including:

  • Every hardware system
  • Every operating system
  • Every hardware network device
  • Every network device operating system
  • Every software application
  • All firmware
  • All language runtime environments
  • All individual libraries

An organization may choose an automated solution to keep track of assets. An administrator should investigate any changed configuration because it may mean that the configuration is not up-to-date. It can also mean that unauthorized changes are happening.


Asset Classification

Asset classification assigns all resources of an organization into a group based on common characteristics. An organization should apply an asset classification system to documents, data records, data files, and disks. The most critical information needs to receive the highest level of protection and may even require special handling.

An organization can adopt a labeling system according to how valuable, how sensitive, and how critical the information is. Complete the following steps to identify and classify the assets of an organization:

  1. Determine the proper asset identification category.
  2. Establish asset accountability by identifying the owner for all information assets and application software.
  3. Determine the criteria for classification.
  4. Implement a classification schema.

For example, the U.S. government uses sensitivity to classify data as follows: top secret; secret; confidential; public trust; and unclassified.

Asset Standardization

Asset management manages the lifecycle and inventory of technology assets including devices and software. As part of an IT asset management system, an organization specifies the acceptable IT assets that meet its objectives. This practice effectively reduces the different asset types. For example, an organization will only install applications that meet its guidelines. When administrators eliminate applications that do not meet the guidelines, they are effectively increasing security.

Asset standards identify specific hardware and software products that the organization uses and supports. When a failure occurs, prompt action helps to maintain both access and security. If an organization does not standardize its hardware selection, personnel may need to scramble to find a replacement component. Non-standard environments require more expertise to manage and they increase the cost of maintenance contracts and inventory. Click here to read about how the military shifted to standards-based hardware for its military communications.

Threat Identification

The United States Computer Emergency Readiness Team (US-CERT) and the U.S. Department of Homeland Security sponsor a dictionary of Common Vulnerabilities and Exposures (CVE). Each CVE entry contains a standard identifier number with a brief description and references to related vulnerability reports and advisories. The MITRE Corporation maintains the CVE List and its public website.

Threat identification begins with the process of creating a CVE Identifier for publicly known cybersecurity vulnerabilities. Each CVE Identifier includes the following:

  • The CVE Identifier number
  • A brief description of the security vulnerability
  • Any important references

Click here to learn more about CVE Identifier.

Risk Analysis

Risk analysis is the process of analyzing the dangers posed by natural and human-caused events to the assets of an organization.

A user performs an asset identification to help determine which assets to protect. A risk analysis has four goals:

  • Identify assets and their value
  • Identify vulnerabilities and threats
  • Quantify the probability and impact of the identified threats
  • Balance the impact of the threat against the cost of the countermeasure

There are two approaches to risk analysis.

Quantitative Risk Analysis

A quantitative analysis assigns numbers to the risk analysis process (Figure 1). The asset value is the replacement cost of the asset. The value of an asset can also be measured by the income gained through use of the asset. The exposure factor (EF) is a subjective value expressed as a percentage that an asset loses due to a particular threat. If a total loss occurs, the EF equals 1.0 (100%). In the quantitative example, the server has an asset value of $15,000. When the server fails, a total loss occurs (the EF equals 1.0). The asset value of $15,000 multiplied by the exposure factor of 1 results in a single loss expectancy of $15,000.

The annualized rate of occurrence (ARO) is the probability that a loss will occur during the year (also expressed as a percentage). An ARO can be greater than 100% if a loss can occur more than once a year.

The annual loss expectancy (ALE) is the single loss expectancy multiplied by the annualized rate of occurrence (ALE = SLE x ARO). This calculation gives management some guidance on what it should spend to protect the asset.
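As a worked illustration of these formulas, the following Python sketch applies the values from the server example; the ARO of 0.10 (one expected failure every ten years) is an assumed value used only for illustration:

  # Quantitative risk analysis: SLE = AV x EF, ALE = SLE x ARO.
  asset_value = 15000        # replacement cost of the server (from the example)
  exposure_factor = 1.0      # a server failure is a total loss
  aro = 0.10                 # assumed: one expected failure every ten years

  sle = asset_value * exposure_factor   # single loss expectancy: $15,000
  ale = sle * aro                       # annual loss expectancy: $1,500
  print(f"SLE = ${sle:,.2f}  ALE = ${ale:,.2f}")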

Qualitative Risk Analysis

Qualitative risk analysis uses opinions and scenarios. Figure 2 provides an example of a table used in qualitative risk analysis, which plots the likelihood of a threat against its impact. For example, the threat of a server failure may be likely, but its impact may only be marginal.

A team evaluates each threat to an asset and plots it in the table. The team ranks the results and uses the results as a guide. They may determine to take action on only threats that fall within the red zone.

The numbers used in the table do not directly relate to any aspect of the analysis. For example, a catastrophic impact of 4 is not twice as bad as a marginal impact of 2. This method is subjective in nature.


Mitigation

Mitigation involves reducing the severity of a loss or the likelihood of the loss occurring. Many technical controls mitigate risk, including authentication systems, file permissions, and firewalls. Organizations and security professionals must understand that risk mitigation can have both positive and negative impacts on the organization. Good risk mitigation finds a balance between the negative impact of countermeasures and controls and the benefit of risk reduction. There are four common ways to reduce risk:

  • Accept the risk and periodically re-assess
  • Reduce the risk by implementing controls
  • Avoid the risk by totally changing the approach
  • Transfer the risk to a third party

A short-term strategy is to accept the risk, which necessitates the creation of contingency plans for that risk. People and organizations have to accept risk on a daily basis. Modern methodologies reduce risk by developing software incrementally and providing regular updates and patches to address vulnerabilities and misconfigurations.

Outsourcing services, purchasing insurance, and purchasing maintenance contracts are all examples of risk transfer. Hiring specialists to perform critical tasks can reduce risk and yield better results with less long-term investment. A good risk mitigation plan can include two or more strategies.


Activity - Perform an Asset Risk Analysis


Layering

Defense in depth will not provide an impenetrable cyber shield, but it will help an organization minimize risk by keeping it one-step ahead of cyber criminals.


If there is only one defense in place to protect data and information, cyber criminals have only to get around that single defense. To make sure data and information remains available, an organization must create different layers of protection.

A layered approach provides the most comprehensive protection. If cyber criminals penetrate one layer, they still have to contend with several more layers with each layer being more complicated than the previous.

Layering is creating a barrier of multiple defenses that coordinate together to prevent attacks. For example, an organization might store its top secret documents on a server in a building surrounded by an electronic fence.

Limiting

Limiting access to data and information reduces the possibility of a threat. An organization should restrict access so that users only have the level of access required to do their job. For example, the people in the marketing department do not need access to payroll records to perform their jobs.

Technology-based solutions such as using file permissions are one way to limit access; an organization should also implement procedural measures. A procedure should be in place that prohibits an employee from removing sensitive documents from the premises.

Diversity

If all of the protected layers were the same, it would not be very difficult for cyber criminals to conduct a successful attack. Therefore, the layers must be different. If cyber criminals penetrate one layer, the same technique will not work on all of the other layers. Breaching one layer of security does not compromise the whole system. An organization may use different encryption algorithms or authentication systems to protect data in different states.

To accomplish the goal of diversity, organizations can use security products manufactured by different companies for multifactor authentication. For example, the server containing the top secret documents is in a locked room that requires a swipe card from one company and biometric authentication supplied by another company.

Obscurity

Obscuring information can also protect data and information. An organization should not reveal any information that cyber criminals can use to figure out what version of the operating system a server is running or the type of equipment it uses. For example, error messages should not contain any details that cyber criminals could use to determine what vulnerabilities are present. Concealing certain types of information makes it more difficult for cyber criminals to attack a system.

Simplicity

Complexity does not necessarily guarantee security. If an organization implements complex systems that are hard to understand and troubleshoot, it may actually backfire. If employees do not understand how to configure a complex solution properly, it may make it just as easy for cyber criminals to compromise those systems. To maintain availability, a security solution should be simple from the inside, but complex on the outside.

Activity - Identify the Layer of Defense


Single Points of Failure

A single point of failure is a critical operation within the organization on which other operations rely; if it fails, those operations halt. A single point of failure can be a special piece of hardware, a process, a specific piece of data, or even an essential utility. Single points of failure are the weak links in the chain that can cause disruption of the organization's operations. Generally, the solution to a single point of failure is to modify the critical operation so that it does not rely on a single element. The organization can also build redundant components into the critical operation to take over the process should one of these points fail.

N+1 Redundancy

N+1 redundancy ensures system availability in the event of a component failure. Components (N) need to have at least one backup component (+1). For example, a car has four tires (N) and a spare tire in the trunk in case of a flat (+1).

In a data center, N+1 redundancy means that the system design can withstand the loss of a component. The N refers to many different components that make up the data center including servers, power supplies, switches, and routers. The +1 is the additional component or system that is standing by ready to go if needed.

An example of N+1 redundancy in a data center is a power generator that comes online when something happens to the main power source. Although an N+1 system contains redundant equipment, it is not a fully redundant system.

RAID

A redundant array of independent disks (RAID) combines multiple physical hard drives into a single logical unit to provide data redundancy and improve performance. RAID takes data that is normally stored on a single disk and spreads it out among several drives. If any single disk is lost, the user can recover data from the other disks where the data also resides.

RAID can also increase the speed of data retrieval. Using multiple drives to retrieve requested data is faster than relying on just one disk to do all of the work.

A RAID solution can be either hardware-based or software-based. A hardware-based solution requires a specialized hardware controller on the system that contains the RAID drives. The following terms describe how RAID stores data on the various disks:

  • Parity - Detects data errors.
  • Striping - Writes data across multiple drives.
  • Mirroring - Stores duplicate data on a second drive.

There are several levels of RAID available as shown in the figure.
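The following Python sketch (a simplified illustration, not a real RAID implementation) shows the idea behind striping with parity: data is split across two drives, a third drive stores the XOR parity, and the contents of either data drive can be rebuilt from the other drive and the parity drive.

  # Conceptual striping with XOR parity across three "drives".
  data = b"AVAILABILITY"
  drive1 = data[0::2]                                       # even-numbered bytes
  drive2 = data[1::2]                                       # odd-numbered bytes
  parity = bytes(a ^ b for a, b in zip(drive1, drive2))     # parity drive

  # If drive1 fails, rebuild it from drive2 and the parity drive.
  rebuilt = bytes(p ^ b for p, b in zip(parity, drive2))
  assert rebuilt == drive1
  print("drive1 rebuilt from parity:", rebuilt)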


Click here to view a RAID level tutorial that explains RAID technology.

Spanning Tree

Redundancy increases the availability of the infrastructure by protecting the network from a single point of failure, such as a failed network cable or a failed switch. When designers build physical redundancy into a network, loops and duplicate frames can occur. Loops and duplicate frames have severe consequences for a switched network.

Spanning Tree Protocol (STP) addresses these issues. The basic function of STP is to prevent loops on a network when switches interconnect via multiple paths. STP ensures that redundant physical links are loop-free. It ensures that there is only one logical path between all destinations on the network. STP intentionally blocks redundant paths that could cause a loop.

Blocking the redundant paths is critical to preventing loops on the network. The physical paths still exist to provide redundancy, but STP disables these paths to prevent the loops from occurring. If a network cable or switch fails, STP recalculates the paths and unblocks the necessary ports to allow the redundant path to become active.

Click Play in the figure to view STP when a failure occurs:

  • PC1 sends a broadcast out onto the network.
  • The trunk link between S2 and S1 fails, resulting in disruption of the original path.
  • S2 unblocks the previously blocked port for Trunk2 and allows the broadcast traffic to traverse the alternate path around the network, permitting communication to continue.
  • If the link between S2 and S1 comes back up, STP again blocks the link between S2 and S3.

Router Redundancy

The default gateway is typically the router that provides devices access to the rest of the network or to the Internet. If there is only one router serving as the default gateway, it is a single point of failure. The organization can choose to install an additional standby router.

In Figure 1, the forwarding router and the standby router use a redundancy protocol to determine which router should take the active role in forwarding traffic. Each router is configured with a physical IP address and a virtual router IP address. End devices use the virtual IP address as the default gateway. The forwarding router is listening for traffic addressed to 192.0.2.100. The forwarding router and the standby router use their physical IP addresses to send periodic messages. The purpose of these messages is to make sure both are still online and available. If the standby router no longer receives these periodic messages from the forwarding router, the standby router will assume the forwarding role, as shown in Figure 2.


The ability of a network to dynamically recover from the failure of a device acting as a default gateway is known as first-hop redundancy.

Router Redundancy Options

The following list defines the options available for router redundancy based on the protocol that defines communication between network devices:
  • Hot Standby Router Protocol (HSRP) - HSRP provides high network availability by providing first-hop routing redundancy. A group of routers use HSRP for selecting an active device and a standby device. In a group of device interfaces, the active device is the device that routes packets; the standby device is the device that takes over when the active device fails. The function of the HSRP standby router is to monitor the operational status of the HSRP group and to quickly assume packet-forwarding responsibility if the active router fails.
  • Virtual Router Redundancy Protocol (VRRP) - A VRRP router runs the VRRP protocol in conjunction with one or more other routers attached to a LAN. In a VRRP configuration, the elected router is the virtual router master, and the other routers act as backups, in case the virtual router master fails.
  • Gateway Load Balancing Protocol (GLBP) - GLBP protects data traffic from a failed router or circuit, like HSRP and VRRP, while also allowing load balancing (also called load sharing) between a group of redundant routers.

Location Redundancy

An organization may need to consider location redundancy depending on its needs. The following outlines three forms of location redundancy.

Synchronous Replication
  • Synchronizes both locations in real time
  • Requires high bandwidth
  • Locations must be close together to reduce latency

Asynchronous Replication
  • Not synchronized in real time but close to it
  • Requires less bandwidth
  • Sites can be further apart because latency is less of an issue

Point-in-Time Replication
  • Updates the backup data location periodically
  • Most bandwidth conservative because it does not require a constant connection

The correct balance between cost and availability will determine the correct choice for an organization.

Resilient Design

Resiliency refers to the methods and configurations used to make a system or network tolerant of failure. For example, a network can have redundant links between switches running STP. Although STP provides an alternate path through the network if a link fails, the switchover may not be immediate if the configuration is not optimal.

Routing protocols also provide resiliency, but fine-tuning can improve the switchover so that network users do not notice. Administrators should investigate non-default settings in a test network to see if they can improve network recovery times.

Resilient design is more than just adding redundancy. It is critical to understand the business needs of the organization, and then incorporate redundancy to create a resilient network.

Application Resilience

Application resilience is an application’s ability to react to problems in one of its components while continuing to function. Downtime can be the result of infrastructure failures, application errors, data corruption, equipment failures, and human errors. An administrator will also eventually need to shut down applications for patching, version upgrades, or to deploy new features.

Many organizations try to balance out the cost of achieving the resiliency of application infrastructure with the cost of losing customers or business due to an application failure. Application high availability is complex and costly. The figure shows three availability solutions to address application resilience. As the availability factor of each solution increases, the complexity and cost also increase.

IOS Resilience

The Internetwork Operating System (IOS) for Cisco routers and switches includes a resilient configuration feature. It allows for faster recovery if someone maliciously or unintentionally reformats flash memory or erases the startup configuration file. The feature maintains a secure working copy of the router IOS image file and a copy of the running configuration file. The user cannot remove these secure files, also known as the primary bootset.

The commands shown in the figure secure the IOS image and running configuration file.

Preparation

Incident response consists of the procedures that an organization follows after an event occurs that is outside the normal range. A data breach releases information into an untrusted environment. A data breach can occur as the result of an accidental or intentional act, and it happens any time an unauthorized person copies, transmits, views, steals, or accesses sensitive information.

When an incident occurs, the organization must know how to respond. An organization needs to develop an incident response plan and put together a Computer Security Incident Response Team (CSIRT) to manage the response. The team performs the following functions:
  • Maintains the incident response plan
  • Ensures its members are knowledgeable about the plan
  • Tests the plan
  • Gets management’s approval of the plan
The CSIRT can be an established group within the organization or an ad hoc one. The team follows a set of predetermined steps to make sure that their approach is uniform and that they do not skip any steps. National CSIRTs oversee incident handling for a country.

Detection and Analysis

Detection starts when someone discovers the incident. Organizations can purchase the most sophisticated detection systems; however, if administrators do not review the logs and monitor alerts, these systems are worthless. Proper detection includes how the incident occurred, what data it involved, and what systems it involved. Notification of the breach goes out to senior management and managers responsible for the data and systems to involve them in the remediation and repair. Detection and analysis includes the following:
  • Alerts and notifications
  • Monitoring and follow-up
Incident analysis helps to identify the source, extent, impact, and details of a data breach. The organization may need to decide if it needs to call in a team of experts to conduct the forensics investigation.

Containment, Eradication, and Recovery

Containment efforts include the immediate actions performed such as disconnecting a system from the network to stop the information leak.

After identifying the breach, the organization needs to contain and eradicate it. This may require additional downtime for systems. The recovery stage includes the actions that the organization needs to take in order to resolve the breach and restore the systems involved. After remediation, the organization needs to restore all systems to their original state before the breach.

Post-Incident Follow-Up

After restoring all operations to a normal state, the organization should look at the cause of the incident and ask the following questions:
  • What actions will prevent the incident from reoccurring?
  • What preventive measures need strengthening?
  • How can it improve system monitoring?
  • How can it minimize downtime during the containment, eradication, and recovery phases?
  • How can management minimize the impact to the business?
A look at the lessons learned can help the organization to better prepare by improving upon its incident response plan.

Activity - Order the Incident Response Phases


Network Admission Control

The purpose of Network Admission Control (NAC) is to allow only authorized users with compliant systems to access the network. A compliant system meets all of the policy requirements of the organization. For example, a laptop that is part of a home wireless network may not be able to connect remotely to the corporate network. NAC evaluates an incoming device against the policies of the network. NAC also quarantines systems that do not comply and manages the remediation of noncompliant systems.

A NAC framework can use the existing network infrastructure and third-party software to enforce security policy compliance for all endpoints. Alternatively, a NAC appliance controls network access, evaluates compliance, and enforces security policy. Common NAC system checks include:
  1. Updated virus detection
  2. Operating systems patches and updates
  3. Complex password enforcement

Intrusion Detection Systems

Intrusion Detection Systems (IDSs) passively monitor the traffic on a network. The figure shows that an IDS-enabled device copies the traffic stream and analyzes the copied traffic rather than the actual forwarded packets. Working offline, it compares the captured traffic stream with known malicious signatures, similar to software that checks for viruses. Working offline means several things:
  • IDS works passively
  • IDS device is physically positioned in the network so that traffic must be mirrored in order to reach it
  • Network traffic does not pass through the IDS unless it is mirrored
Passive means that the IDS monitors and reports on traffic. It does not take any action. This is the definition of operating in promiscuous mode.

The advantage of operating with a copy of the traffic is that the IDS does not negatively affect the packet flow of the forwarded traffic. The disadvantage of operating on a copy of the traffic is that the IDS cannot stop malicious single-packet attacks from reaching the target before responding to the attack. An IDS often requires assistance from other networking devices, such as routers and firewalls, to respond to an attack.

A better solution is to use a device that can immediately detect and stop an attack. An Intrusion Prevention System (IPS) performs this function.

Intrusion Prevention Systems

An IPS builds upon IDS technology. However, an IPS device operates in inline mode. This means that all incoming and outgoing traffic must flow through it for processing. As shown in the figure, an IPS does not allow packets to enter the trusted side of the network unless it has analyzed the packets. It can detect and immediately address a network problem.

An IPS monitors network traffic. It analyzes the contents and the payload of the packets for more sophisticated embedded attacks that might include malicious data. Some systems use a blend of detection technologies, including signature-based, profile-based, and protocol analysis-based intrusion detection. This deeper analysis enables the IPS to identify, stop, and block attacks that would pass through a traditional firewall device. When a packet comes in through an interface on an IPS, the outbound or trusted interface does not receive that packet until the IPS analyzes the packet.

The advantage of operating in inline mode is that the IPS can stop single-packet attacks from reaching the target system. The disadvantage is that a poorly configured IPS can negatively affect the packet flow of the forwarded traffic.

The biggest difference between IDS and IPS is that an IPS responds immediately and does not allow any malicious traffic to pass, whereas an IDS allows malicious traffic to pass before addressing the problem.


NetFlow and IPFIX

NetFlow is a Cisco IOS technology that provides statistics on packets flowing through a Cisco router or multilayer switch. NetFlow is the standard for collecting operational data from networks. The Internet Engineering Task Force (IETF) used Cisco’s NetFlow Version 9 as the basis for IP Flow Information Export (IPFIX).

IPFIX is a standard format for exporting router-based information about network traffic flows to data collection devices. IPFIX works on routers and management applications that support the protocol. Network managers can export network traffic information from a router and use this information to optimize network performance.

Applications that support IPFIX can display statistics from any router that supports the standard. Collecting, storing, and analyzing the aggregated information provided by IPFIX supported devices provides the following benefits:
  • Secures the network against internal and external threats
  • Troubleshoots network failures quickly and precisely
  • Analyzes network flows for capacity planning

Advanced Threat Intelligence

With the right information, advanced threat intelligence can help organizations detect attacks during one of the stages of the cyberattack, and sometimes even before the attack begins.

Organizations may be able to detect indicators of attack in their logs and system reports by watching for the following security alerts:
  • Account lockouts
  • All database events
  • Asset creation and deletion
  • Configuration modification to systems
Advanced threat intelligence is a type of event or profile data that can contribute to security monitoring and response. As the cyber criminals become more sophisticated, it is important to understand the malware maneuvers. With improved visibility into attack methodologies, an organization can respond more quickly to incidents.

Types of Disasters

It is critical to keep an organization functioning when a disaster occurs. A disaster includes any natural or human-caused event that damages assets or property and impairs the ability for the organization to continue operating.

Natural Disasters

Natural disasters differ depending on location. Some of these events are difficult to predict. Natural disasters fall into the following categories:
  • Geological disasters include earthquakes, landslides, volcanoes, and tsunamis
  • Meteorological disasters include hurricanes, tornadoes, snow storms, lightning, and hail
  • Health disasters include widespread illnesses, quarantines, and pandemics
  • Miscellaneous disasters include fires, floods, solar storms, and avalanches
Human-caused Disasters

Human-caused disasters involve people or organizations and fall into the following categories:
  • Labor events include strikes, walkouts, and slowdowns
  • Social-political events include vandalism, blockades, protests, sabotage, terrorism, and war
  • Materials events include hazardous spills and fires
  • Utilities disruptions include power failures, communication outages, fuel shortages, and radioactive fallout
Click here to see satellite photos of Japan before and after the 2011 Earthquake and Tsunami.

Disaster Recovery Plan

An organization puts its disaster recovery plan (DRP) into action while the disaster is ongoing and employees are scrambling to ensure critical systems are online. The DRP includes the activities the organization takes to assess, salvage, repair, and restore damaged facilities or assets.

To create the DRP, answer the following questions:
  • Who is responsible for this process?
  • What does the individual need to perform the process?
  • Where does the individual perform this process?
  • What is the process?
  • Why is the process critical?
A DRP needs to identify which processes in the organization are the most critical. During the recovery process, the organization restores its mission critical systems first.

Implementing Disaster Recovery Controls

Disaster recovery controls minimize the effects of a disaster to ensure that resources and business processes can resume operation.

There are three types of IT disaster recovery controls:
  • Preventative measures include controls that prevent a disaster from occurring. These measures seek to identify risks.
  • Detective measures include controls that discover unwanted events. These measures uncover new potential threats.
  • Corrective measures include controls that restore the system after a disaster or an event.

Need for Business Continuity

Business continuity is one of the most important concepts in computer security. Even though companies do whatever they can to prevent disasters and loss of data, it is impossible to predict every possible scenario. It is important for companies to have plans in place that ensure business continuity regardless of what may occur. A business continuity plan is a broader plan than a DRP because it includes getting critical systems to another location while repair of the original facility is under way. Personnel continue to perform all business processes in an alternate manner until normal operations resume.

Availability ensures that the resources required to keep the organization going will continue to be available to the personnel and the systems that rely on them.

Business Continuity Considerations

Business continuity controls are more than just backing up data and providing redundant hardware. Organizations need employees to properly configure and operate systems. Data can be useless until it provides information. An organization should look at the following:
  • Getting the right people to the right places
  • Documenting configurations
  • Establishing alternate communications channels for both voice and data
  • Providing power
  • Identifying all dependencies for applications and processes so that they are properly understood
  • Understanding how to carry out automated tasks manually

Business Continuity Best Practices

As shown in the figure, the National Institute of Standards and Technology (NIST) developed the following best practices:

  1. Write a policy that provides guidance to develop the business continuity plan and assigns roles to carry out the tasks.
  2. Identify critical systems and processes and prioritize them based on necessity.
  3. Identify vulnerabilities and threats, and calculate risks.
  4. Identify and implement controls and countermeasures to reduce risk.
  5. Devise methods to bring back critical systems quickly.
  6. Write procedures to keep the organization functioning in a chaotic state.
  7. Test the plan.
  8. Update the plan regularly.






Chapter 5: The Art of Ensuring Integrity

Integrity ensures that data remains unaltered by anyone or anything over its entire life cycle and can therefore be trusted. Data integrity is a critical component of the design, implementation, and usage of any system that stores, processes, or transmits data. This chapter begins by discussing the types of data integrity controls used, such as hashing algorithms, salting, and keyed-hash message authentication codes (HMAC). The use of digital signatures and certificates builds on these data integrity controls to give users a way of verifying the authenticity of messages and documents. The chapter concludes with a discussion of database integrity enforcement. Having a well-controlled and well-defined data integrity system increases the stability, performance, and maintainability of a database system.

What is Hashing?

Users need to know that their data remains unchanged while at rest or in transit. Hashing is a tool that ensures data integrity by taking binary data (the message) and producing a fixed-length representation called the hash value or message digest, as shown in the figure.

The hash tool uses a cryptographic hashing function to verify and ensure data integrity. It can also be used for authentication. Hash functions are used in place of clear text passwords or encryption keys because they are one-way functions: it is easy to compute a hash from the input, but computationally infeasible to recover the input from the hash. A password hashed with a specific hashing algorithm always produces the same hash digest, and it is also computationally infeasible for two different sets of data to produce the same digest.

Every time the data is changed or altered, the hash value also changes. Because of this, cryptographic hash values are often called digital fingerprints. They can be used to detect duplicate data files, detect file version changes, and perform similar tasks. These values guard against accidental or intentional changes to the data, as well as accidental data corruption. Hashing is also very efficient: a large file or the contents of an entire disk drive produces a hash value of the same fixed length.


Hashing Properties

Hashing is a one-way mathematical function that is relatively easy to compute, but significantly harder to reverse. Grinding coffee is a good analogy of a one-way function. It is easy to grind coffee beans, but it is almost impossible to put all of the tiny pieces back together to rebuild the original beans.

A cryptographic hash function has the following properties:

  • The input can be any length.
  • The output has a fixed length.
  • The hash function is one way and is not reversible.
  • Two different input values will almost never result in the same hash value.

Hashing Algorithms

Hash functions are helpful to ensure that a user or communication error does not change the data accidentally. For instance, a sender may want to make sure that no one alters a message on its way to the recipient. The sending device inputs the message into a hashing algorithm and computes its fixed-length digest or fingerprint.

Simple Hash Algorithm (8-bit Checksum)

The 8-bit checksum is one of the first hashing algorithms, and it is the simplest form of a hash function. An 8-bit checksum calculates the hash by converting the message into binary numbers and then organizing the string of binary numbers into 8-bit chunks. The algorithm adds up the 8-bit values, keeping only the lowest 8 bits of the sum. The final step is to convert the result using a process called 2’s complement: invert every bit (each zero becomes a one and each one becomes a zero) and then add 1, producing the 8-bit hash value.

Click here to calculate the 8-bit hash for the message BOB.

1. Convert BOB to binary using the ASCII code, as shown in Figure 1.

2. Convert the binary numbers to hexadecimal, as shown in Figure 2.

3. Enter the hexadecimal numbers into the calculator (42 4F 42).

4. Click the Calculate button. The result is the hash value 2D.

Try the following examples:

SECRET = “S”=53 “E”=45 “C”=43 “R”=52 “E”=45 “T”=54

HASH VALUE = 3A

MESSAGE = “M”=4D “E”=45 “S”=53 “S”=53 “A”=41 “G”=47 “E”=45

HASH VALUE = FB
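The same calculation can be scripted. This minimal Python sketch of the 8-bit checksum described above reproduces the hash values 2D, 3A, and FB for the three sample messages:

  # 8-bit checksum: sum the ASCII values, keep the low 8 bits,
  # then take the 2's complement (invert the bits and add 1).
  def checksum_8bit(message: str) -> int:
      total = sum(ord(ch) for ch in message) & 0xFF
      return (~total + 1) & 0xFF

  for word in ("BOB", "SECRET", "MESSAGE"):
      print(word, format(checksum_8bit(word), "02X"))   # prints 2D, 3A, FB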

Modern Hashing Algorithms

There are many modern hashing algorithms widely used today. Two of the most popular are MD5 and SHA.

Message Digest 5 (MD5) Algorithm

Ron Rivest developed the MD5 hashing algorithm, and several Internet applications use it today. MD5 is a one-way function that makes it easy to compute a hash from the given input data but makes it very difficult to compute input data given only a hash value.

MD5 produces a 128-bit hash value. The Flame malware compromised the security of MD5 in 2012. The authors of the Flame malware used an MD5 collision to forge a Windows code-signing certificate. Click here to read an explanation of the Flame malware collision attack.

Secure Hash Algorithm (SHA)

The U.S. National Institute of Standards and Technology (NIST) developed SHA, the algorithm specified in the Secure Hash Standard (SHS). NIST published SHA-1 in 1995. SHA-2 replaced SHA-1 with four additional hash functions that make up the SHA family:
  • SHA-224 (224 bit)
  • SHA-256 (256 bit)
  • SHA-384 (384 bit)
  • SHA-512 (512 bit)
SHA-2 is a stronger algorithm, and it is replacing MD5. SHA-256, SHA-384, and SHA-512 are the next-generation algorithms.
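As a simple illustration using Python's standard hashlib module, the digest length depends only on the algorithm, not on the size of the input, and changing even one character of the input produces a completely different hash:

  import hashlib

  for text in ("The quick brown fox", "The quick brown fox."):
      md5 = hashlib.md5(text.encode()).hexdigest()         # 128-bit digest, 32 hex characters
      sha256 = hashlib.sha256(text.encode()).hexdigest()   # 256-bit digest, 64 hex characters
      print(f"{text!r}")
      print(f"  MD5:     {md5}")
      print(f"  SHA-256: {sha256}")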

Hashing Files and Digital Media

Integrity ensures that data and information is complete and unaltered at the time of its acquisition. This is important to know when a user downloads a file from the Internet or a forensic examiner is looking for evidence on digital media.

To verify the integrity of all IOS images, Cisco provides MD5 and SHA checksums at Cisco’s Download Software website. The user can make a comparison of this MD5 digest against the MD5 digest of an IOS image installed on a device, as shown in the figure. The user can now feel confident that no one has tampered or modified the IOS image file.

Note: The command verify /md5, shown in the figure, is beyond the scope of this course.

The field of digital forensics uses hashing to verify all digital media that contain files. For example, the examiner creates a hash and a bit-for-bit copy of the media containing the files to produce a digital clone. The examiner compares the hash of the original media with the copy. If the two values match, the copies are identical. The fact that one set of bits is identical to the original set of bits establishes fixity. Fixity helps to answer several questions:
  • Does the examiner have the files he expects?
  • Is the data corrupted or changed?
  • Can the examiner prove that the files are not corrupt?
Now the forensics expert can examine the copy for any digital evidence while leaving the original intact and untouched.
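A download or forensic integrity check like this can be sketched with Python's hashlib; the file names here are hypothetical, and the file is read in chunks so that even very large media images can be hashed:

  import hashlib

  def sha256_of_file(path, chunk_size=65536):
      digest = hashlib.sha256()
      with open(path, "rb") as f:
          for chunk in iter(lambda: f.read(chunk_size), b""):
              digest.update(chunk)                 # hash the file one chunk at a time
      return digest.hexdigest()

  # Hypothetical file names: compare the original media image with its clone.
  if sha256_of_file("original.img") == sha256_of_file("clone.img"):
      print("Digests match: the copy is bit-for-bit identical to the original.")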

Hashing Passwords

Hashing algorithms turn any amount of data into a fixed-length fingerprint or digital hash. A criminal cannot reverse a digital hash to discover the original input. If the input changes at all, it results in a different hash. This works for protecting passwords. A system needs to store a password in a form that protects it and can still verify that a user’s password is correct.

The figure shows the workflow for user account registration and authentication using a hash-based system. The system never writes the password to the hard drive; it stores only the digital hash.

Applications

Use cryptographic hash functions in the following situations:

  • To provide proof of authenticity when used with a symmetric secret authentication key, such as IP Security (IPsec) or routing protocol authentication
  • To provide authentication by generating one-time and one-way responses to challenges in authentication protocols
  • To provide message integrity checks, such as those used in digitally signed contracts and public key infrastructure (PKI) certificates, like those accepted when accessing a secure site using a browser
When choosing a hashing algorithm, avoid SHA-1 and MD5 due to the discovery of security flaws. In production networks, implement SHA-256 or higher, as these are currently the most secure.

While hashing can detect accidental changes, it cannot guard against deliberate changes. There is no unique identifying information from the sender in the hashing procedure. This means that anyone can compute a hash for any data, as long as they have the correct hash function. For example, when a message traverses the network, a potential attacker could intercept the message, change it, recalculate the hash, and append the hash to the message. The receiving device will only validate against whatever hash is appended. Therefore, hashing is vulnerable to man-in-the-middle attacks and does not provide security to transmitted data.


Cracking Hashes

To crack a hash, an attacker must guess the password. The top two attacks used to guess passwords are dictionary and brute-force attacks.

A dictionary attack uses a file containing common words, phrases, and passwords, with the hash of each entry calculated in advance. The attack compares these precomputed hashes with the stolen password hashes. If a hash matches, the attacker knows a group of potentially valid passwords.

A brute-force attack attempts every possible combination of characters up to a given length. A brute-force attack takes a lot of processor time, but it is just a matter of time before this method discovers the password. Passwords need to be long enough to make the time it takes to execute a brute-force attack too long to be worthwhile. Hashing passwords makes it more difficult for the criminal to retrieve those passwords.
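A rough back-of-the-envelope estimate (with assumed numbers: 95 printable characters and one billion guesses per second) shows why longer passwords push brute-force attacks out of reach:

  # Assumed values for illustration only.
  charset_size = 95                     # printable ASCII characters
  guesses_per_second = 1_000_000_000    # one billion hash guesses per second

  for length in (6, 8, 10, 12):
      combinations = charset_size ** length
      years = combinations / guesses_per_second / (365 * 24 * 3600)
      print(f"{length} characters: {combinations:.2e} combinations, about {years:.2e} years to exhaust")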

Activity - Identify Hashing Terminology

What is Salting?

Salting makes password hashing more secure. If two users have the same password, they will also have the same password hashes. A salt, which is a random string of characters, is an additional input to the password before hashing. This creates a different hash result for the two passwords as shown in the figure. A database stores both the hash and the salt.

In the figure, the same password generates a different hash because the salt in each instance is different. The salt does not have to be secret since it is a random number.
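A minimal sketch of the idea, using Python's standard hashlib and os modules: because each user receives a different random salt, the same password produces different stored hashes.

  import hashlib, os

  def salted_hash(password):
      salt = os.urandom(16)                          # random salt, stored alongside the hash
      digest = hashlib.sha256(salt + password.encode()).hexdigest()
      return salt, digest

  # Same password, different salts, therefore different hashes.
  print(salted_hash("letmein")[1])
  print(salted_hash("letmein")[1])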

Preventing Attacks

Salting prevents an attacker from using a dictionary attack to try to guess passwords. Salting also makes it impossible to use lookup tables and rainbow tables to crack a hash.

Lookup Tables

A lookup table stores the pre-computed hashes of passwords in a password dictionary along with the corresponding password. A lookup table is a data structure that processes hundreds of hash lookups per second. Click here to see how fast a lookup table can crack a hash.

Reverse Lookup Tables

This attack allows a cybercriminal to launch a dictionary or brute-force attack on many hashes without a precomputed lookup table. The cybercriminal creates a lookup table that maps each password hash from the breached account database to a list of users. The cybercriminal then hashes each password guess and uses the lookup table to get a list of users whose password matched that guess, as shown in the figure. Because many users share the same password, the attack works well.

Rainbow Tables

Rainbow tables sacrifice hash-cracking speed to make the lookup tables smaller. A smaller table means that the table can store the solutions to more hashes in the same amount of space.

Implementing Salting

A Cryptographically Secure Pseudo-Random Number Generator (CSPRNG) is the best choice to generate salt. CSPRNGs generate a random number that has a high level of randomness and is completely unpredictable, so it is cryptographically secure.

To implement salting successfully, use the following recommendations:
  • The salt needs to be unique for every user password.
  • Never reuse a salt.
  • The length of the salt should match the length of the hash function’s output.
  • Always hash on the server in a web application.
Using a technique called key stretching will also help to protect against attack. Key stretching makes the hash function deliberately slow. This makes high-end hardware that can compute billions of hashes per second far less effective.

The steps a database application uses to store and validate a salted password are shown in the figure.
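The store-and-validate workflow can be sketched as follows, using Python's hashlib.pbkdf2_hmac as the slow, key-stretching hash; the iteration count of 100,000 is an assumed value chosen for illustration:

  import hashlib, hmac, os

  def store_password(password):
      salt = os.urandom(16)
      digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)   # key stretching
      return salt, digest                            # the database stores both values

  def validate_password(password, salt, stored_digest):
      digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
      return hmac.compare_digest(digest, stored_digest)   # constant-time comparison

  salt, stored = store_password("S3cr3t!")
  print(validate_password("S3cr3t!", salt, stored))   # True
  print(validate_password("guess", salt, stored))     # False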

What is an HMAC?

The next step in preventing a cybercriminal from launching a dictionary or brute-force attack on a hash is to add a secret key to the hash, so that only a party who knows the key can validate the hash. One way to do this is to include the secret key in the hash using a hash algorithm called keyed-hash message authentication code (HMAC or KHMAC). HMACs use an additional secret key as input to the hash function. The use of HMAC goes a step further than integrity assurance alone by adding authentication. An HMAC uses a specific algorithm that combines a cryptographic hash function with a secret key, as shown in the figure.

Only the sender and the receiver know the secret key, and the output of the hash function now depends on the input data and the secret key. Only parties who have access to that secret key can compute the digest of an HMAC function. This characteristic defeats man-in-the-middle attacks and provides authentication of the data origin.

HMAC Operation

Consider an example where a sender wants to ensure that a message remains unchanged in transit and wants to provide a way for the receiver to authenticate the origin of the message.

As shown in Figure 1, the sending device inputs data (such as Terry Smith’s pay of $100 and the secret key) into the hashing algorithm and calculates the fixed-length HMAC digest or fingerprint. The receiver gets the authenticated fingerprint attached to the message.

In Figure 2, the receiving device removes the fingerprint from the message and uses the plaintext message with its secret key as input to the same hashing function. If the receiving device calculates a fingerprint equal to the fingerprint sent, the message is still in its original form. Additionally, the receiver knows the origin of the message because only the sender possesses a copy of the shared secret key. The HMAC function proved the authenticity of the message.
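The exchange in the figures can be sketched with Python's standard hmac module; the message and key values here are illustrative only.

  import hashlib, hmac

  secret_key = b"shared-secret-key"          # known only to the sender and the receiver
  message = b"Pay Terry Smith $100"

  # The sender computes the HMAC fingerprint and appends it to the message.
  fingerprint = hmac.new(secret_key, message, hashlib.sha256).hexdigest()

  # The receiver recomputes the HMAC over the received message and compares.
  recomputed = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
  print(hmac.compare_digest(fingerprint, recomputed))   # True: unchanged and from the key holder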

Application of HMAC

HMACs can also authenticate a web user. Many web services use basic authentication, which does not encrypt the username and password during transmission. Using HMAC, the user sends a private key identifier and an HMAC. The server looks up the user’s private key and creates an HMAC. The user’s HMAC must match the one calculated by the server.

VPNs using IPsec rely on HMAC functions to authenticate the origin of every packet and provide data integrity checking.

As shown in the figure, Cisco products use hashing for entity authentication, data integrity, and data authenticity purposes:
  • Cisco IOS routers use hashing with secret keys in an HMAC-like manner to add authentication information to routing protocol updates.
  • IPsec gateways and clients use hashing algorithms, such as MD5 and SHA-1 in HMAC mode, to provide packet integrity and authenticity.
  • Cisco software images on Cisco.com have an MD5-based checksum available so that customers can check the integrity of downloaded images.
Note: The term entity can refer to devices or systems within an organization.

What is a Digital Signature?

Handwritten signatures and stamped seals prove authorship of the contents of a document. Digital signatures can provide the same functionality as handwritten signatures.

Unprotected digital documents are very easy for anyone to change. A digital signature can determine if someone edits a document after the user signs it. A digital signature is a mathematical method used to check the authenticity and integrity of a message, digital document, or software.

In many countries, digital signatures have the same legal importance as a manually signed document. Electronic signatures are binding for contracts, negotiations, or any other document requiring a handwritten signature. An audit trail tracks the electronic document’s history for regulatory and legal defense purposes.

A digital signature helps to establish authenticity, integrity, and non-repudiation. Digital signatures have specific properties that enable entity authentication and data integrity as shown in the figure.

Digital signatures are an alternative to HMAC.


Non-Repudiation

To repudiate means to deny. Non-repudiation is a way to ensure that the sender of a message or document cannot deny having sent the message or document and that the recipient cannot deny having received the message or document.

A digital signature ensures that the sender electronically signed the message or document. Since a digital signature is unique to the individual creating it, that person cannot later deny that he or she provided the signature.

Processes of Creating a Digital Signature

Asymmetric cryptography is the basis for digital signatures. A public key algorithm like RSA generates two keys: one private and the other public. The keys are mathematically related.

Alice wants to send Bob an email that contains important information for the rollout of a new product. Alice wants to make sure that Bob knows that the message came from her, and that the message did not change after she sent it.

Alice creates the message along with a digest of the message. She then encrypts this digest with her private key as shown in Figure 1. Alice bundles the message, the encrypted message digest, and her public key together to create the signed document. Alice sends this to Bob as shown in Figure 2.

Bob receives the message and reads it. To make sure that the message came from Alice, he creates a message digest of the message. He takes the encrypted message digest received from Alice and decrypts it using Alice’s public key. Bob compares the message digest received from Alice with the one he generated. If they match, Bob knows that he can trust that no one tampered with the message as shown in Figure 3.
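A minimal sketch of the Alice and Bob exchange in Python, assuming a recent version of the third-party cryptography package is installed; the message text and the SHA-256/PKCS#1 v1.5 choices are assumptions made for illustration:

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

message = b"New product rollout details"   # Alice's message (illustrative)

# Alice: generate an RSA key pair and sign a digest of the message with the private key.
alice_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
alice_public = alice_private.public_key()
signature = alice_private.sign(message, padding.PKCS1v15(), hashes.SHA256())

# Bob: verify the signature with Alice's public key; an exception means tampering or forgery.
try:
    alice_public.verify(signature, message, padding.PKCS1v15(), hashes.SHA256())
    print("Signature valid - the message came from Alice and was not changed")
except InvalidSignature:
    print("Signature invalid - do not trust the message")

In practice, Bob would obtain Alice’s public key from her digital certificate rather than directly from the message, which is the topic of the sections that follow.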

Using Digital Signatures

Signing a hash instead of the whole document provides efficiency, compatibility, and integrity. Organizations may want to replace paper documents and ink signatures with a solution that assures the electronic document meets all legal requirements.

The following two situations provide examples of using digital signatures:

  • Code signing - Used to verify the integrity of executable files downloaded from a vendor website. Code signing also uses signed digital certificates to authenticate and verify the identity of the site (Figure 1).
  • Digital certificates - Used to verify the identity of an organization or individual to authenticate a vendor website and establish an encrypted connection to exchange confidential data (Figure 2).

Comparing Digital Signature Algorithms

The three common digital signature algorithms are Digital Signature Algorithm (DSA), Rivest-Shamir-Adleman (RSA), and Elliptic Curve Digital Signature Algorithm (ECDSA). All three generate and verify digital signatures. These algorithms depend upon asymmetric encryption and public key techniques. Digital signatures require two operations:

1. Signature generation

2. Signature verification

The signer’s private key generates the signature, and the matching public key verifies it.

DSA is based on the difficulty of the discrete logarithm problem rather than on factoring large numbers. The U.S. government adopted DSA as a standard for creating digital signatures. DSA provides only the signature; it does not extend beyond the signature to the message itself.

RSA is the most common public key cryptography algorithm in use today. RSA is named after the individuals who created it in 1977: Ron Rivest, Adi Shamir, and Leonard Adleman. RSA depends on asymmetric encryption and covers both signing and encrypting the content of a message.

DSA is faster than RSA as a signing service for a digital document. RSA is better suited for applications that require signing and verification of electronic documents as well as message encryption.

Like most areas of cryptography, the RSA algorithm is based on two mathematical principles: modular arithmetic and prime number factorization. Click here to learn more about how RSA uses both modular arithmetic and prime number factorization.

ECDSA is the newest digital signature algorithm and is gradually replacing RSA. The advantage of ECDSA is that it can use much smaller key sizes for the same level of security and requires less computation than RSA.
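As a hedged sketch of the smaller-key advantage, again assuming the third-party cryptography package: a 256-bit elliptic curve key (curve P-256) offers security comparable to a much larger RSA key, and the signing and verification calls mirror the earlier RSA example.

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

message = b"Sign me with a small key"      # illustrative message

# A 256-bit elliptic curve key versus the 2048-bit RSA key used earlier.
private_key = ec.generate_private_key(ec.SECP256R1())
signature = private_key.sign(message, ec.ECDSA(hashes.SHA256()))

try:
    private_key.public_key().verify(signature, message, ec.ECDSA(hashes.SHA256()))
    print("ECDSA signature verified")
except InvalidSignature:
    print("Verification failed")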

What is a Digital Certificate?

A digital certificate is the electronic equivalent of a passport. Digital certificates enable users, hosts, and organizations to exchange information securely over the Internet. Specifically, a digital certificate authenticates and verifies that the user sending a message is who he or she claims to be. Digital certificates can also provide confidentiality by giving the receiver the means to encrypt a reply.

Digital certificates are similar to physical certificates. For example, the paper-based Cisco Certified Network Associate Security (CCNA-S) certificate in Figure 1 identifies the individual, the Certificate Authority (who authorized the certificate), and for how long the certificate is valid. Notice how the digital certificate in Figure 2 also identifies similar elements.


Using Digital Certificates

Refer to the figure to help understand how digital certificates are used to authenticate a person or entity. In this scenario, Bob is purchasing an online item from Alice's website. Bob's browser will use Alice's digital certificate to ensure he is actually shopping on Alice's website and to secure traffic between them.

Step 1: Bob decides to buy something on Alice's website and clicks on "Proceed to Checkout". His browser initiates a secure connection with Alice's web server and displays a lock icon in the security status bar.

Step 2: Alice's web server receives the request and replies by sending its digital certificate containing the web server public key and other information to Bob's browser.

Step 3: Bob's browser checks the digital certificate against stored certificates and confirms that he is indeed connected to Alice's web server. Only trusted certificates permit the transaction to go forward. If the certificate is not valid, then communication fails.

Step 4: Bob's web browser creates a unique session key that will be used for secure communication with Alice's web server.

Step 5: Bob's browser encrypts the session key using Alice's public key and sends it to Alice's web server.

Step 6: Alice's web server receives the encrypted message from Bob's browser. It uses its private key to decrypt the message and discover the session key. Future exchanges between the web server and browser will now use the session key to encrypt data.
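A hedged sketch of steps 1 through 6 using Python's standard ssl module, which performs the certificate check and session-key negotiation automatically; the host name is a placeholder for Alice's web server and the code needs network access to run:

import socket
import ssl

hostname = "www.example.com"            # placeholder for Alice's web server

context = ssl.create_default_context()  # loads the trusted root CA certificates
with socket.create_connection((hostname, 443)) as sock:
    # wrap_socket validates the server's certificate chain and host name, then
    # negotiates the session keys used to encrypt all further traffic.
    with context.wrap_socket(sock, server_hostname=hostname) as tls:
        print("Negotiated protocol:", tls.version())
        print("Server certificate subject:", tls.getpeercert()["subject"])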

What is a Certificate Authority?

On the Internet, continually exchanging identification between all parties would be impractical. Therefore, individuals agree to accept the word of a neutral third party. Presumably, the third party does an in-depth investigation prior to the issuance of credentials. After this in-depth investigation, the third party issues credentials that are difficult to forge. From that point forward, all individuals who trust the third party simply accept the credentials that the third party issues.

For example, in the figure Alice applies for a driver’s license. In this process, she submits evidence of her identity, such as birth certificate, picture ID, and more to a government-licensing bureau. The bureau validates Alice’s identity and permits Alice to complete a driver’s examination. Upon successful completion, the licensing bureau issues Alice a driver license. Later, Alice needs to cash a check at the bank. Upon presenting the check to the bank teller, the bank teller asks her for ID. The bank, because it trusts the government-licensing bureau, verifies her identity and cashes the check.

A certificate authority (CA) functions the same as the licensing bureau. The CA issues digital certificates that authenticate the identity of organizations and users. These certificates also sign messages to ensure that no one tampered with the messages.

What is Inside a Digital Certificate?

As long as a digital certificate follows a standard structure, any entity can read and understand it regardless of the issuer. X.509 is a standard for a public key infrastructure (PKI) to manage digital certificates. PKI is the policies, roles, and procedures required to create, manage, distribute, use, store, and revoke digital certificates. The X.509 standard specifies that digital certificates contain the standard information shown in the figure.
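A minimal sketch of reading those standard fields from a PEM-encoded certificate file, assuming a recent version of the third-party cryptography package; the file name is a placeholder:

from cryptography import x509

with open("server-cert.pem", "rb") as cert_file:    # placeholder file name
    cert = x509.load_pem_x509_certificate(cert_file.read())

print("Subject:        ", cert.subject.rfc4514_string())
print("Issuer (CA):    ", cert.issuer.rfc4514_string())
print("Serial number:  ", cert.serial_number)
print("Not valid after:", cert.not_valid_after)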

The Validation Process

Browsers and applications perform a validation check before they trust a certificate. The three processes include the following:
  • Certificate Discovery builds the certification path by selecting a certificate of the issuing CA for each certificate in the chain
  • Path Validation checks each certificate in the path, starting with the root CA’s certificate
  • Revocation determines whether the certificate was revoked and why

The Certificate Path

An individual gets a certificate for a public key from a commercial CA. The certificate belongs to a chain of certificates called the chain of trust. The number of certificates in the chain depends on the hierarchical structure of the CA.

The figure shows a certificate chain for a two-tier CA: an offline root CA and an online subordinate CA. The reason for the two-tier structure is easier recovery in the event of a compromise. If the online subordinate CA is compromised, the offline root CA can sign a certificate for a replacement online CA. Without an offline root CA, a compromise would force a user to install a new root CA certificate on every client machine, phone, or tablet.

Activity - Order the Steps in the Digital Certificate Process


Data Integrity

Databases provide an efficient way to store, retrieve, and analyze data. As data collection increases and data becomes more sensitive, it is important for cybersecurity professionals to protect the growing number of databases. Think of a database as an electronic filing system. Data integrity refers to the accuracy, consistency, and reliability of data stored in a database. The responsibility of data integrity falls on database designers, developers, and the organization’s management.

The four data integrity rules or constraints are as follows:
  • Entity Integrity: All rows must have a unique identifier called a Primary Key (Figure 1).
  • Domain Integrity: All data stored in a column must follow the same format and definition (Figure 2).
  • Referential Integrity: Table relationships must remain consistent. Therefore, a user cannot delete a record which is related to another one (Figure 3).
  • User-defined Integrity: A set of rules defined by a user which does not belong to one of the other categories. For example, a customer places a new order, as shown in Figure 4. The user first checks to see if this is a new customer. If it is, the user adds the new customer to the customers table.
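A minimal sketch of how the four rules above map onto SQL constraints, using Python's built-in sqlite3 module; the table and column names are made up for illustration:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # sqlite3 enforces referential integrity only when enabled

conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,       -- entity integrity: a unique, non-NULL identifier
    email       TEXT NOT NULL UNIQUE       -- domain integrity: every value follows the same definition
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),   -- referential integrity
    total       REAL CHECK (total >= 0)    -- user-defined integrity: a business rule on the amount
);
""")

conn.execute("INSERT INTO customers (customer_id, email) VALUES (1, 'terry@example.com')")
conn.execute("INSERT INTO orders (order_id, customer_id, total) VALUES (10, 1, 99.50)")

# Violations raise sqlite3.IntegrityError, for example an order for a customer that does not exist:
try:
    conn.execute("INSERT INTO orders (order_id, customer_id, total) VALUES (11, 99, 5.00)")
except sqlite3.IntegrityError as error:
    print("Rejected:", error)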

Data Entry Controls

Data entry involves inputting data to a system. A set of controls ensures that users enter the correct data.

Drop Down Master Data Controls

Provide a drop-down list driven by a master table instead of asking individuals to type the data. An example of a drop-down master data control is using the location list from the U.S. postal address system to standardize addresses.

Data Field Validation Controls

Set up rules for basic checks including:
  • Mandatory input ensures that a required field contains data
  • Input masks prevent users from entering invalid data or help ensure that they enter data consistently (a phone number format, for example)
  • Positive number checks ensure that dollar amounts are positive
  • Data ranges ensure that a user enters data within a given range (rejecting a date of birth of 01-18-1820 as out of range, for example)
  • Mandatory second person approval (a deposit or withdrawal request greater than a specified value triggers approval by a second or third person)
  • Maximum record modification trigger (if the number of records modified within a specific period of time exceeds a predetermined number, the system blocks the user until a manager determines whether or not the transactions were legitimate)
  • Unusual activity trigger (a system locks when it recognizes unusual activity)
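A short sketch of several of these checks in Python; the field names, phone mask, and date range are illustrative assumptions, not values from the course:

import re
from datetime import date

PHONE_MASK = re.compile(r"^\d{3}-\d{3}-\d{4}$")   # input mask for a US-style phone number

def validate_entry(name, phone, amount, birth_date):
    errors = []
    if not name:                                             # mandatory input
        errors.append("name is required")
    if not PHONE_MASK.match(phone):                          # input mask
        errors.append("phone must look like 555-123-4567")
    if amount <= 0:                                          # positive dollar amount
        errors.append("amount must be positive")
    if not (date(1900, 1, 1) <= birth_date <= date.today()): # data range
        errors.append("date of birth out of range")
    return errors

print(validate_entry("Terry", "555-123-4567", 100.0, date(1985, 1, 18)))   # []
print(validate_entry("", "bad-number", -5.0, date(1820, 1, 18)))           # four errors

In practice, these rules usually run both in the user interface and again on the server, because client-side checks alone are easy to bypass.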

Validation Rules

A validation rule checks that data falls within the parameters defined by the database designer. A validation rule helps to ensure the completeness, accuracy and consistency of data. The criteria used in a validation rule include the following:
  • Size – checks the number of characters in a data item
  • Format – checks that the data conforms to a specified format
  • Consistency – checks for the consistency of codes in related data items
  • Range – checks that data lies within a minimum and maximum value
  • Check digit – provides for an extra calculation to generate a check digit for error detection
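As a sketch of the check-digit criterion: the Luhn algorithm, the check-digit calculation used on most payment card numbers, detects common single-digit entry errors.

def luhn_valid(number: str) -> bool:
    """Return True if the trailing check digit of the number is consistent (Luhn algorithm)."""
    digits = [int(d) for d in number if d.isdigit()]
    total = 0
    # Starting from the rightmost digit, double every second digit and subtract 9 if it exceeds 9.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

print(luhn_valid("79927398713"))   # True  - the standard Luhn test number
print(luhn_valid("79927398710"))   # False - check digit does not match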


Data Type Validation

Data type validation is the simplest form of data validation and verifies that the data a user enters is consistent with the type of characters expected. For example, a phone number field would not accept alpha characters. Common database data types include integers, strings, and decimals.
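A minimal sketch of data type validation: confirm that a value can be interpreted as the expected type before storing it. The helper names below are made up for illustration.

import re
from decimal import Decimal, InvalidOperation

def is_integer(value: str) -> bool:
    # An optional sign followed by digits only.
    return re.fullmatch(r"[+-]?\d+", value.strip()) is not None

def is_decimal(value: str) -> bool:
    try:
        Decimal(value.strip())
        return True
    except InvalidOperation:
        return False

print(is_integer("8675309"))     # True  - acceptable for a numeric phone field
print(is_integer("867-JENNY"))   # False - alpha characters rejected
print(is_decimal("19.99"))       # True  - valid decimal amount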

Input Validation

One of the most vulnerable aspects of database integrity management is controlling the data input process. Many well-known attacks run against a database to insert malformed data, which can confuse or crash the application or cause it to divulge too much information to the attacker. Attackers can also automate these input attacks.

For example, users fill out a form via a Web application to subscribe to a newsletter. The database application automatically generates and sends an email confirmation containing a URL link for confirming the subscription. Attackers can modify that URL link, changing the username, email address, or subscription status before it returns to the server hosting the application. If the Web server does not verify that the email address or other account information submitted matches the subscription information, the server accepts bogus information. Hackers can automate the attack to flood the Web application with thousands of invalid subscribers to the newsletter database.
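One hedged countermeasure for the newsletter example, tying back to the HMAC section earlier: the application can attach an HMAC of the subscriber's details to the confirmation URL so that any change to the username, email address, or status is detected when the link returns to the server. The secret key, fields, and URL below are illustrative assumptions.

import hmac
import hashlib
from urllib.parse import urlencode, parse_qs

SERVER_KEY = b"server-side-secret"    # illustrative key known only to the server

def confirmation_link(email: str, status: str) -> str:
    params = {"email": email, "status": status}
    token = hmac.new(SERVER_KEY, urlencode(sorted(params.items())).encode(), hashlib.sha256).hexdigest()
    params["token"] = token
    return "https://example.com/confirm?" + urlencode(params)

def verify_confirmation(query_string: str) -> bool:
    fields = {key: values[0] for key, values in parse_qs(query_string).items()}
    token = fields.pop("token", "")
    expected = hmac.new(SERVER_KEY, urlencode(sorted(fields.items())).encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(token, expected)

link = confirmation_link("terry@example.com", "subscribed")
print(verify_confirmation(link.split("?", 1)[1]))                                       # True - untampered link
print(verify_confirmation("email=attacker%40evil.com&status=subscribed&token=forged"))  # False - tampered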

Anomaly Verification

Anomaly detection refers to identifying patterns in data that do not conform to expected behavior. Depending on the database application, these non-conforming patterns are called anomalies, outliers, exceptions, aberrations, or surprises. Anomaly detection and verification is an important countermeasure or safeguard in fraud detection. Database anomaly detection can identify credit card and insurance fraud and can protect data from massive destruction or changes.

Anomaly verification means verifying data requests or modifications when a system detects unusual or surprising patterns. An example is a credit card used for two transactions in widely separated locations within a short time. If a transaction request from New York City occurs at 10:30 a.m. and a second request comes from Chicago at 10:35 a.m., the system triggers a verification of the second transaction.

A second example occurs when an unusual number of email address modifications happen across a large number of database records. Because attackers can use harvested email data to launch DoS attacks, the modification of email addresses in hundreds of records could indicate that an attacker is using the organization’s database as a tool for his or her DoS attack.
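A minimal sketch of the velocity check described in the credit card example; the timestamps, cities, and two-hour threshold are assumptions for illustration:

from datetime import datetime, timedelta

# Hypothetical recent transactions for one card: (timestamp, city)
transactions = [
    (datetime(2024, 5, 1, 10, 30), "New York City"),
    (datetime(2024, 5, 1, 10, 35), "Chicago"),
]

MIN_TRAVEL_TIME = timedelta(hours=2)   # assumed minimum time to travel between distant cities

def needs_verification(history):
    """Flag the latest transaction if it occurs in a different city too soon after the previous one."""
    (prev_time, prev_city), (curr_time, curr_city) = history[-2], history[-1]
    return prev_city != curr_city and (curr_time - prev_time) < MIN_TRAVEL_TIME

print(needs_verification(transactions))   # True - trigger verification of the Chicago transaction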

Activity - Identify the Database Integrity Controls


Entity Integrity

A database is like an electronic filing system. Maintaining proper filing is critical in maintaining the trustworthiness and usefulness of the data within the database. Tables, records, fields, and data within each field make up a database. In order to maintain the integrity of the database filing system, users must follow certain rules. Entity integrity is an integrity rule, which states that every table must have a primary key and that the column or columns chosen to be the primary key must be unique and not NULL. Null in a database signifies missing or unknown values. Entity integrity enables proper organization of data for that record as shown in the figure.

Referential Integrity

Another important concept is the relationship between different filing systems or tables. The basis of referential integrity is foreign keys. A foreign key in one table references a primary key in a second table. The primary key for a table uniquely identifies entities (rows) in the table. Referential integrity maintains the integrity of foreign keys.

Domain Integrity

Domain integrity ensures that all the data items in a column fall within a defined set of valid values. Each column in a table has a defined set of values, such as the set of all valid numbers for credit card numbers, social security numbers, or email addresses. Limiting the value assigned to an instance of that column (an attribute) enforces domain integrity. Domain integrity enforcement can be as simple as choosing the correct data type, length, and format for a column.


