Thursday, November 29, 2012

Challenges in Agile Testing




Every software development lifecycle model, from sequential to spiral to incremental to Agile, has testing implications. Some of these implications ease the testing process, while others challenge it.

A blended testing strategy consists of three types of test strategies:

• Analytical risk-based testing;
• Automated regression testing;
• Reactive testing (also referred to as dynamic testing).

The blended testing strategy aligns well with Scrum and other agile methodologies.
In some cases, these strategies mitigate the testing risks and reduce the testing challenges associated with these methodologies.  
However, it does not resolve all of the risks and challenges. A few of the challenges, and the best approaches to mitigate them, are discussed below:


Dealing with the Volume and Speed of Change

“Welcome changing requirements” is one of the principles of agile development.
Many testing strategies, especially analytical requirements-based testing, become quite inefficient in such situations.
However, risk-based testing accommodates change, since we can always add, remove, or change risks and adjust the level of risk. If test execution is underway, we can adjust our plan for the remaining period based on this new view of quality risk.

Reactive testing, which requires little documentation, is also quite resilient in the face of change.

However, changes to the definition of the product and its correct behavior can still impose inefficiencies on the development, execution, and maintenance of tests, especially when the test team is not kept informed of those changes or when the rate of change is very high.

Remaining Effective during Very Short Iterations

Agile methodologies like Scrum are less formal and move quickly in short “sprints”. These short, fast-paced iterations further squeeze the test team’s ability to develop and maintain test systems, compounding the effects of change.

Testing strategies that include an automation element have proven particularly sensitive to this challenge.

The risk-based element of the recommended strategy can help.
Risk-based testing focuses on the important areas of test coverage, and de-emphasizes or even cuts less important areas, relieving some of the pressure created by the short iterations. This ability to focus proves especially helpful for test teams also under tight resource constraints. Test teams in an agile world should develop, maintain, and execute tests in risk priority order.
Using risk priority to sequence development and maintenance efforts allows the test team to have the most important tests ready at the beginning of each sprint’s test execution period.
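As a rough illustration only (no particular tool is implied here), the following minimal Python sketch shows what "risk priority order" can mean in practice: test cases carry an assumed risk priority number, and the plan fills a fixed execution-time budget in priority order so the most important tests always run first. All names and numbers are made up for the example.
Code:
# Hypothetical test inventory; lower risk_priority = more important.
tests = [
    {"name": "checkout payment", "risk_priority": 1, "minutes": 30},
    {"name": "profile page layout", "risk_priority": 5, "minutes": 20},
    {"name": "login and session", "risk_priority": 2, "minutes": 25},
]

budget_minutes = 60          # time left in the sprint's execution period
plan, used = [], 0
for test in sorted(tests, key=lambda t: t["risk_priority"]):
    if used + test["minutes"] <= budget_minutes:
        plan.append(test["name"])
        used += test["minutes"]

print(plan)                  # less important tests slip into the next sprint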

Receiving Code after Inconsistent and Often Inadequate Unit Testing

The short test execution periods in Agile sprints, compared to sequential projects, mean that one or two days of test progress blocked by highly buggy code causes proportionally more damage than it would in a sequential project.
Delivery of unstable, buggy code will undermine one of the key benefits of the risk-based
testing portion of the recommended strategy, which is the discovery of the most important
defects early in the test execution period.
It also inevitably leads to a large degree of code churn during testing, since so much must change to fix the bugs.
The amount of change can ultimately outstrip even the ability of the best automated regression test system to keep up, which would then lead to lower defect detection effectiveness for the test team.

Managing the Increased Regression Risk

Agile methodology advocates emphasize good automated unit testing
in part to manage the regression risk inherent in such churn.
However, good unit testing has limited defect removal effectiveness.
Therefore, we need effective regression testing at the system test level (which has a higher level of defect detection effectiveness).
By combining risk-based testing with the automated regression testing, test teams can effectively manage the increased regression risk.

Making Do with Poor, Changing, and Missing Test Oracles

Agile methodologies de-value written documentation. Special scorn is reserved for specifications.
For example, the Agile Manifesto suggests people should value “working software over comprehensive documentation.”
This creates real challenges for a test team. Testers use requirements specifications and other documents as test oracles; i.e., as the means to determine correct behavior under a given test condition.
Testers in Agile situations are often given documents with insufficient detail, or, in some cases, no such documents at all. No known test strategy can resolve this challenge; resolving it requires change management.
Further, the inability to determine precisely whether a test failed affects both the efficiency of the testing and the defect detection effectiveness.
When testers spend time isolating situations that the project team ultimately chooses to define as correct behavior, that takes away time they could have spent finding and reporting real bugs.
These missed bugs create subsequent problems for customers, users, and technical support staff, and distractions for developers and test teams.
Further, the situation creates frustration for the testers that reduces their morale and, consequently, their effectiveness.
Testers want to produce valid information. When much of the information they produce – in the form of rejected bug reports – ends up in the figurative wastebasket of the project, people tend to wonder why they bother.
It’s important to realize that this reduction in test effectiveness, efficiency, and morale is a
potential side-effect of agile methodologies. Bad problems can get much worse when the test team is held accountable for outcomes beyond their control.

Holding to Arbitrary Sprint Durations

Some of our clients following agile methodologies like Scrum tend to ritualize some of
the rules of the process. The time deadlines for sprints seem particularly subject to this ritualization.
So, on the last Friday of the sprint, with development finishing late, the arbitrary deadline remains intact at the expense of the test team’s weekend.
Fully resolving this challenge requires team and management maturity: when people habitually and systematically over-commit, the underlying estimation problem needs to be fixed.
If fixing the software estimation problem is not possible, risk-based testing at least helps the test team deal with systematic and constant over-commitment.
To start with, when the test team is time-crunched over and over again at sprint’s end, the test team should accept that the project team’s priorities are schedule-driven, not quality-driven.
The test team should revise its risk analysis approach to institute an across-the-board reduction in the extent of testing assigned to each quality risk item during risk analysis for subsequent projects.
That way, at least the test team won’t over-commit.
In some cases, in spite of reducing the scope of testing, the test team still can’t execute all the tests in the available time at the end of a sprint. If so, rather than ruining their weekend to run every test, the test team can select the most important tests using the risk priority number.
Less important tests can slip into the next sprint.

Conclusion
The recommended test strategies support the stated goals of Agile methodologies.

Risk-based testing supports increased quality, since it focuses testing on high-risk
areas where testing can significantly reduce the risk. Risk-based testing supports increased productivity, since it reduces or eliminates testing where the quality risk is lower. Risk-based testing supports flexibility, since it allows regular revision of the quality risk items, which re-aligns the remaining testing with the new risks and their new levels of risk.
Automated regression testing helps to contain the regression risks associated with Agile methodologies, allowing a higher rate of change.

Reactive testing allows testers to explore various aspects of the system that risk-based testing and automated regression testing together might miss.

Friday, November 23, 2012

Use Perfmon to monitor servers and find bottlenecks

What and When to Measure
Bottlenecks occur when a resource reaches its capacity, causing the performance of the entire system to slow down. Bottlenecks are typically caused by insufficient or misconfigured resources, malfunctioning components, and incorrect requests for resources by a program.
There are five major resource areas that can cause bottlenecks and affect server performance: physical disk, memory, process, CPU, and network. If any of these resources are overutilized, your server or application can become noticeably slow or can even crash. I will go through each of these five areas, giving guidance on the counters you should be using and offering suggested thresholds to measure the pulse of your servers.
Since the sampling interval has a significant impact on the size of the log file and the server load, you should set the sample interval based on the average elapsed time for the issue to occur so you can establish a baseline before the issue occurs again. This will allow you to spot any trend leading to the issue.
Fifteen minutes will provide a good window for establishing a baseline during normal operations. Set the sample interval to 15 seconds if the average elapsed time for the issue to occur is about four hours. If the time for the issue to occur is eight hours or more, set the sampling interval to no less than five minutes; otherwise, you will end up with a very large log file, making it more difficult to analyze the data.
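If you prefer to script a collection like this rather than clicking through Performance Monitor, here is a hedged sketch (not part of the original guidance) that shells out from Python to the built-in typeperf utility, sampling a few counters every 15 seconds for roughly an hour. The counter paths and the -si/-sc/-f/-o flags follow typeperf's documented usage, but verify them in your environment.
Code:
import subprocess

# Counter paths as they appear in Performance Monitor.
counters = [
    r"\Processor(_Total)\% Processor Time",
    r"\Memory\Available MBytes",
    r"\PhysicalDisk(_Total)\Avg. Disk sec/Read",
]

# -si = sample interval (seconds), -sc = sample count (240 x 15s = 1 hour),
# -f/-o = output format and file for later analysis.
subprocess.run(
    ["typeperf", *counters, "-si", "15", "-sc", "240", "-f", "CSV", "-o", "baseline.csv"],
    check=False)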
Hard Disk Bottleneck
Since the disk system stores and handles programs and data on the server, a bottleneck affecting disk usage and speed will have a big impact on the server’s overall performance.
Please note that if the disk objects have not been enabled on your server, you need to use the command-line tool Diskperf to enable them. Also, note that % Disk Time can exceed 100 percent and, therefore, I prefer to use % Idle Time, Avg. Disk sec/Read, and Avg. Disk sec/write to give me a more accurate picture of how busy the hard disk is. You can find more on % Disk Time in the Knowledge Base article available at support.microsoft.com/kb/310067.
Following are the counters the Microsoft Service Support engineers rely on for disk monitoring.
LogicalDisk\% Free Space This measures the percentage of free space on the selected logical disk drive. Take note if this falls below 15 percent, as you risk running out of free space for the OS to store critical files. One obvious solution here is to add more disk space.
PhysicalDisk\% Idle Time This measures the percentage of time the disk was idle during the sample interval. If this counter falls below 20 percent, the disk system is saturated. You may consider replacing the current disk system with a faster disk system.
PhysicalDisk\Avg. Disk Sec/Read This measures the average time, in seconds, to read data from the disk. If the number is larger than 25 milliseconds (ms), that means the disk system is experiencing latency when reading from the disk. For mission-critical servers hosting SQL Server® and Exchange Server, the acceptable threshold is much lower, approximately 10 ms. The most logical solution here is to replace the current disk system with a faster disk system.
PhysicalDisk\Avg. Disk Sec/Write This measures the average time, in seconds, it takes to write data to the disk. If the number is larger than 25 ms, the disk system experiences latency when writing to the disk. For mission-critical servers hosting SQL Server and Exchange Server, the acceptable threshold is much lower, approximately 10 ms. The likely solution here is to replace the disk system with a faster disk system.
PhysicalDisk\Avg. Disk Queue Length This indicates how many I/O operations are waiting for the hard drive to become available. If the value here is larger than two times the number of spindles, that means the disk itself may be the bottleneck.
Memory\Cache Bytes This indicates the amount of memory being used for the file system cache. There may be a disk bottleneck if this value is greater than 300MB.
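For a quick scripted spot check of the free-space threshold above, the sketch below uses the third-party psutil library (an assumption on my part, not something Perfmon requires); note that latency and % Idle Time still have to come from the Perfmon counters themselves.
Code:
import psutil

for part in psutil.disk_partitions(all=False):
    try:
        usage = psutil.disk_usage(part.mountpoint)
    except OSError:
        continue  # e.g. an empty CD-ROM drive
    free_pct = 100 - usage.percent
    flag = "LOW" if free_pct < 15 else "ok"   # 15 percent threshold from above
    print(f"{part.mountpoint}: {free_pct:.1f}% free [{flag}]")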
Memory Bottleneck
A memory shortage is typically due to insufficient RAM, a memory leak, or a memory switch placed inside the boot.ini. Before I get into memory counters, I should discuss the /3GB switch.
More memory reduces disk I/O activity and, in turn, improves application performance. The /3GB switch was introduced in Windows NT® as a way to provide more memory for the user-mode programs.
Windows uses a virtual address space of 4GB (independent of how much physical RAM the system has). By default, the lower 2GB are reserved for user-mode programs and the upper 2GB are reserved for kernel-mode programs. With the /3GB switch, 3GB are given to user-mode processes. This, of course, comes at the expense of the kernel memory, which will have only 1GB of virtual address space. This can cause problems because Pool Non-Paged Bytes, Pool Paged Bytes, Free System Page Tables Entries, and desktop heap are all squeezed together within this 1GB space. Therefore, the /3GB switch should only be used after thorough testing has been done in your environment.
This is a consideration if you suspect you are experiencing a memory-related bottleneck. If the /3GB switch is not the cause of the problems, you can use these counters for diagnosing a potential memory bottleneck.
Memory\% Committed Bytes in Use This measures the ratio of Committed Bytes to the Commit Limit—in other words, the amount of virtual memory in use. This indicates insufficient memory if the number is greater than 80 percent. The obvious solution for this is to add more memory.
Memory\Available Mbytes This measures the amount of physical memory, in megabytes, available for running processes. If this value is less than 5 percent of the total physical RAM, that means there is insufficient memory, and that can increase paging activity. To resolve this problem, you should simply add more memory.
Memory\Free System Page Table Entries This indicates the number of page table entries not currently in use by the system. If the number is less than 5,000, there may well be a memory leak.
Memory\Pool Non-Paged Bytes This measures the size, in bytes, of the non-paged pool. This is an area of system memory for objects that cannot be written to disk but instead must remain in physical memory as long as they are allocated. There is a possible memory leak if the value is greater than 175MB (or 100MB with the /3GB switch). A typical Event ID 2019 is recorded in the system event log.
Memory\Pool Paged Bytes This measures the size, in bytes, of the paged pool. This is an area of system memory used for objects that can be written to disk when they are not being used. There may be a memory leak if this value is greater than 250MB (or 170MB with the /3GB switch). A typical Event ID 2020 is recorded in the system event log.
Memory\Pages per Second This measures the rate at which pages are read from or written to disk to resolve hard page faults. If the value is greater than 1,000, as a result of excessive paging, there may be a memory leak.
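A similar hedged spot check of Memory\Available MBytes, again assuming psutil is installed: flag the host if available physical memory drops below 5 percent of the total.
Code:
import psutil

vm = psutil.virtual_memory()
available_mb = vm.available / (1024 * 1024)
threshold_mb = vm.total * 0.05 / (1024 * 1024)
print(f"Available: {available_mb:.0f} MB (threshold {threshold_mb:.0f} MB)")
if available_mb < threshold_mb:
    print("WARNING: insufficient memory, expect increased paging activity")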
Processor Bottleneck
An overwhelmed processor can be due to the processor itself not offering enough power or it can be due to an inefficient application. You must double-check whether the processor spends a lot of time in paging as a result of insufficient physical memory. When investigating a potential processor bottleneck, the Microsoft Service Support engineers use the following counters.
Processor\% Processor Time This measures the percentage of elapsed time the processor spends executing a non-idle thread. If the percentage is greater than 85 percent, the processor is overwhelmed and the server may require a faster processor.
Processor\% User Time This measures the percentage of elapsed time the processor spends in user mode. If this value is high, the server is busy with the application. One possible solution here is to optimize the application that is using up the processor resources.
Processor\% Interrupt Time This measures the time the processor spends receiving and servicing hardware interruptions during specific sample intervals. This counter indicates a possible hardware issue if the value is greater than 15 percent.
System\Processor Queue Length This indicates the number of threads in the processor queue. The server doesn’t have enough processor power if the value is more than two times the number of CPUs for an extended period of time.
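And a rough scripted equivalent of watching Processor\% Processor Time for sustained readings above 85 percent (psutil assumed; the processor queue length still needs Perfmon itself).
Code:
import psutil

# Twelve 5-second samples, roughly one minute of observation.
samples = [psutil.cpu_percent(interval=5) for _ in range(12)]
print("samples:", samples)
if all(s > 85 for s in samples):
    print("WARNING: processor appears overwhelmed (sustained > 85%)")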
Network Bottleneck
A network bottleneck, of course, affects the server’s ability to send and receive data across the network. It can be an issue with the network card on the server, or perhaps the network is saturated and needs to be segmented. You can use the following counters to diagnose potential network bottlenecks.
Network Interface\Bytes Total/Sec This measures the rate at which bytes are sent and received over each network adapter, including framing characters. The network is saturated if you discover that more than 70 percent of the interface is consumed. For a 100-Mbps NIC, that threshold is 8.75MB/sec (100 Mbps = 12.5MB/sec; 12.5MB/sec × 70 percent = 8.75MB/sec). In a situation like this, you may want to add a faster network card or segment the network.
Network Interface\Output Queue Length This measures the length of the output packet queue, in packets. There is network saturation if the value is more than 2. You can address this problem by adding a faster network card or segmenting the network.
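The 70 percent rule can also be approximated in a script: sample per-NIC traffic over a short window and compare it with 70 percent of the NIC's rated speed. This is a psutil-based sketch, not a substitute for the counters above.
Code:
import time
import psutil

INTERVAL = 10  # seconds
before = psutil.net_io_counters(pernic=True)
time.sleep(INTERVAL)
after = psutil.net_io_counters(pernic=True)

for nic, stats in psutil.net_if_stats().items():
    if not stats.isup or stats.speed == 0 or nic not in before or nic not in after:
        continue
    delta = (after[nic].bytes_sent + after[nic].bytes_recv
             - before[nic].bytes_sent - before[nic].bytes_recv)
    rate = delta / INTERVAL                   # bytes per second
    capacity = stats.speed * 1_000_000 / 8    # rated Mbps -> bytes per second
    status = "SATURATED" if rate > 0.7 * capacity else "ok"
    print(f"{nic}: {rate / capacity:.0%} of capacity [{status}]")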
Process Bottleneck
Server performance will be significantly affected if you have a misbehaving process or non-optimized processes. Thread and handle leaks will eventually bring down a server, and excessive processor usage will bring a server to a crawl. The following counters are indispensable when diagnosing process-related bottlenecks.
Process\Handle Count This measures the total number of handles that are currently open by a process. This counter indicates a possible handle leak if the number is greater than 10,000.
Process\Thread Count This measures the number of threads currently active in a process. There may be a thread leak if the difference between the minimum and maximum number of threads is more than 500.
Process\Private Bytes This indicates the amount of memory that this process has allocated that cannot be shared with other processes. There may be a memory leak if the difference between the minimum and maximum values is greater than 250MB.
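As a quick scripted check of the handle and thread thresholds above (psutil assumed, and the PID is a placeholder; Private Bytes itself is still best read from Perfmon):
Code:
import psutil

proc = psutil.Process(1234)          # replace 1234 with the PID you are watching
threads = proc.num_threads()
# num_handles() is Windows-only in psutil, so guard for other platforms.
handles = proc.num_handles() if hasattr(proc, "num_handles") else None
print(f"{proc.name()}: threads={threads} handles={handles}")
if threads > 500:
    print("possible thread leak")
if handles is not None and handles > 10_000:
    print("possible handle leak")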
Wrapping Up
Now you know what counters the Service Support engineers at Microsoft use to diagnose various bottlenecks. Of course, you will most likely come up with your own set of favorite counters tailored to suit your specific needs. You may want to save time by not having to add all your favorite counters manually each time you need to monitor your servers. Fortunately, there is an option in the Performance Monitor that allows you to save all your counters in a template for later use.
You may still be wondering whether you should run Performance Monitor locally or remotely. And exactly what will the performance hit be when running Performance Monitor locally? This all depends on your specific environment. The performance hit on the server is almost negligible if you set intervals to at least five minutes.
You may want to run Performance Monitor locally if you know there is a performance issue on the server, since Performance Monitor may not be able to capture data from a remote machine when it is running out of resources on the server. Running it remotely from a central machine is really best suited to situations when you want to monitor or baseline multiple servers.

Wednesday, November 7, 2012

Security testing

Security Testing

First things first: to start testing for security bugs we first need to know what to look for. What are we looking for? Any ideas? :?: :!:

:idea: :idea: :idea:
Yes? The girl in the back with the blue shirt, what's that you say? A work-list? That's correct! Thanks. :D

So we need a work-list - a method - something generic that will cover all exits on any pattern. How can we come up with something like that?! Oh, that's where I come in :lol:

Whenever we want to test for security, we need to focus on the following list:
Input Validation - attacks like XSS and SQL injection
Authentication - DoS, brute force attacks and spamming (attacks on forms)
Session & Cookie management - account hijacking by editing or stealing the session with an attack called Man In The Middle (MITM)
Authorization - attacks that will gain you access to other accounts or admin privileges
Error handling - information disclosure of the server's version and other components, such as the database version
Coding - text files or JS (JavaScript) files containing juicy information (paths to the admin panel, for example); code comments may also be very handy
Network Configurations - open ports that can give an attacker a nice back-door to the server (21: FTP, 22: SSH, 23: Telnet, 1433: MsSQL, 3306: MySQL, etc.). Also, HTTP methods that should be forbidden, such as PUT, DELETE and TRACE, can give a big advantage to the attacker.

----It is better that you copy and paste this list----

So, after we've come up with a good and solid list, we need to understand how to test. Let's say we've been asked to test http://www.somedomain.com/ :)

Oh, almost forgot - every one of the sections above (AKA our worklist) is attached to the other. You'll see :geek:

The first section is Input Validation.
Read - Cross-Site Scripting (XSS) and SQL injection from my 'Becoming a hacker' topic, to learn the basics of testing input validations.

Testing input validation can lead you very quickly to discover the site's error handling: if one of your injections isn't sanitized correctly, the server will probably redirect you to its error page. If the site's admin hasn't implemented a nice redirection to a friendly error like 'Sorry, but the page you were looking for is not here', the server's own error will pop up, displaying information about it... sometimes it can lead you to a new attack :mrgreen:
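If you want to automate that first pass, here is a minimal, hedged sketch using Python and the third-party requests library: it sends a few classic probe payloads to a single GET parameter and looks for reflections or database error strings in the response. The URL and parameter name are made up for the example, and of course you should only run this against targets you are authorized to test.
Code:
import requests

TARGET = "http://www.somedomain.com/search.php"   # hypothetical page
PARAM = "q"                                        # hypothetical parameter
PAYLOADS = ["'", "\"", "<script>alert(1)</script>", "1' OR '1'='1"]
ERROR_HINTS = ["sql syntax", "mysql_fetch", "odbc", "ora-", "unclosed quotation"]

for payload in PAYLOADS:
    resp = requests.get(TARGET, params={PARAM: payload}, timeout=10)
    body = resp.text.lower()
    reflected = payload.lower() in body
    db_error = any(hint in body for hint in ERROR_HINTS)
    print(f"{payload!r}: status={resp.status_code} reflected={reflected} db_error={db_error}")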

Authentication, Session & Cookie management, Authorization are all a part of using the right tool.

Authentication - The easiest way to discover whether a form is vulnerable to brute force is to look for a Captcha implementation. A Captcha is a well-known feature that restricts the form to human usage only. All we're talking about is an image with numbers in it and a field for the user to copy those numbers.. the problem is that there is no automated tool yet to read those images and copy their content into the field. That way an admin can be sure that automated tools will have a hard time hijacking accounts.
A nice tool for a PoC (Proof of Concept) will be the Burp Proxy.
The free license is enough.
Go to those links after you've successfully downloaded the proxy, and use my examples. See ya soon.
1. Burp/Session 1
2. Burp/Session 2
3. Burp/Session 3
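For context, this is the kind of scripted login automation a Captcha is meant to stop - a tiny, hedged Python sketch with made-up URL and field names and a three-word demo list, to be run only against systems you are allowed to test.
Code:
import requests

LOGIN_URL = "http://www.somedomain.com/Login.php"   # hypothetical login form
USERNAME = "testuser"                               # hypothetical account

for candidate in ["123456", "password", "letmein"]:  # tiny demo wordlist
    resp = requests.post(LOGIN_URL,
                         data={"username": USERNAME, "password": candidate},
                         allow_redirects=False, timeout=10)
    # A redirect or a changed response size often signals a successful login.
    print(candidate, resp.status_code, len(resp.content))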

Now that you are Burp freaks... we'll go back to our worklist.
Next is Session & Cookie management. Well, I'm not gonna teach it here because it takes time and experience. Once you know how to use Burp, and where the Cookie: header is, all you need to do is read about Session Fixation and you'll understand everything.
If you really want to simulate a hijack, take my example for Click Jacking. Enjoy 8-)

Authorization - Let's say we're testing on http://www.something.com/
Now after entering the site there is an immediate redirection to http://www.something.com/Login.php.
uTest will probably give you a user and pass, and then all you need to do is log in.
After logging in you noticed this URL: something.com/welcome.php?uid=1203&mode=post&cid=e45fdsv4543rrfd
How could it be vulnerable?
Look again... what is the first parameter? That's correct - uid --> user id.
This means your user id is 1203. So basically there is a list in the database and you are number 1203. Does it necessarily mean that there are more than 1200 users? Of course not. This number can be part of an automated sequence.
If the session management is misconfigured, authorization can be manipulated using this parameter.. all you need to do is switch the numbers. Yes, you can also try 1203'+or+'1'='1 or 1203">....
You see... all strings attached.
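Here is a hedged sketch of that uid-switching check in Python: request the same page with neighbouring uid values through your authenticated session and compare the responses. The URL, parameters and cookie name are placeholders taken from the example above.
Code:
import requests

BASE = "http://www.something.com/welcome.php"      # hypothetical page
session = requests.Session()
session.cookies.set("PHPSESSID", "your-session-id-here")   # placeholder session cookie

baseline = session.get(BASE, params={"uid": 1203, "mode": "post"}, timeout=10)
for uid in (1201, 1202, 1204):
    resp = session.get(BASE, params={"uid": uid, "mode": "post"}, timeout=10)
    # If another uid returns 200 with a full page, authorization may be broken.
    print(uid, resp.status_code, len(resp.content), "vs baseline", len(baseline.content))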
Another attack can occur using the Coding section of our list.
JavaScript files can hold a record like this -
Code:
 path = url + "/admin/panel/index.php";

Just copy and paste it and maybe you will achieve an authorization bypass. Even if not a full one.. a partial bypass also helps.

Network Configurations - in order to find the open ports you need to install Nmap.
Nmap ("Network Mapper") is a free and open source (license) utility for network discovery and security auditing. Many systems and network administrators also find it useful for tasks such as network inventory, managing service upgrade schedules, and monitoring host or service uptime. Nmap uses raw IP packets in novel ways to determine what hosts are available on the network, what services (application name and version) those hosts are offering, what operating systems (and OS versions) they are running, what type of packet filters/firewalls are in use, and dozens of other characteristics.

To test HTTP methods, all you need to do is first replace GET/POST with OPTIONS, and then you'll get the methods available.
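Here's a small hedged sketch of that check in Python with requests (hypothetical URL): it sends an OPTIONS request, reads the standard Allow header, and flags the risky methods discussed next.
Code:
import requests

URL = "http://www.somedomain.com/"   # hypothetical target
resp = requests.options(URL, timeout=10)
allowed = [m.strip().upper() for m in resp.headers.get("Allow", "").split(",") if m.strip()]
print("Allowed methods:", allowed or "not disclosed")
for risky in ("PUT", "DELETE", "TRACE"):
    if risky in allowed:
        print(f"WARNING: {risky} is enabled -- investigate further")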
One of the dangerous ones is PUT, and it has two main kinds of uses:
1. Activate/deactivate components - like Refresh
2. Create new files or directories on the server - RISK

If an attacker can use the PUT method, she can create a file containing a malicious admin panel called a shell. This shell can run a lot of dangerous features such as malicious scripts, SQL queries, opening ports for back-doors, DoS (denial of service) scripts and more.
The DELETE method follows the same pattern, only instead of creating it deletes the file/directory. I think you've guessed the risk.. 8-)
The TRACE method can enable Cross-Site Tracing (XST).
and more...

Well there you go... you're all set for your first Security test.
I wish you the best of luck, and may the force be with you.