<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Webtrends Optimization &#187; test design</title>
	<atom:link href="http://blogs.webtrends.com/optimization/tag/test-design/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.webtrends.com/optimization</link>
	<description>Just another Webtrends Blogs weblog</description>
	<lastBuildDate>Tue, 12 Jan 2010 15:00:42 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Rules for a successful multivariate test (Billy’s Optimization Guide Part 3)</title>
		<link>http://blogs.webtrends.com/optimization/2009/06/16/rules-successful-multivariate-test-billys-optimization-guide-part-3/</link>
		<comments>http://blogs.webtrends.com/optimization/2009/06/16/rules-successful-multivariate-test-billys-optimization-guide-part-3/#comments</comments>
		<pubDate>Wed, 17 Jun 2009 00:15:20 +0000</pubDate>
		<dc:creator>Billy Shih</dc:creator>
				<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Testing Concerns]]></category>
		<category><![CDATA[Testing Techniques]]></category>
		<category><![CDATA[Billy's Optimization Guide]]></category>
		<category><![CDATA[fractional factorial]]></category>
		<category><![CDATA[full factorial]]></category>
		<category><![CDATA[multivariate testing]]></category>
		<category><![CDATA[stabilization]]></category>
		<category><![CDATA[test design]]></category>
		<category><![CDATA[test type]]></category>

		<guid isPermaLink="false">http://testingblog.widemile.com/?p=365</guid>
		<description><![CDATA[
If you missed it, see Part 1 (A/B Split Testing) and Part 2 (Multivariate Test Basics).
With the basics of part 2 down, it&#8217;s time to start designing a multivariate test.  Every optimization project has different challenges and goals, but luckily there are a few rules that apply to every multivariate test design.  These rules fit [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center"><img class="size-full wp-image-426 aligncenter" src="http://blogs.webtrends.com/optimization/files/2009/06/rules.jpg" alt="Rules of Six Detail" width="450" height="296" /></p>
<p><em>If you missed it, see <a href="../2009/01/26/pro-a-b-split-test-method/">Part 1 (A/B Split Testing)</a> and <a href="http://testingblog.widemile.com/2009/01/29/simplifying-multivariate-testing-fo-billys-optimization-guide-part-2/">Part 2 (Multivariate Test Basics)</a>.</em></p>
<p>With the basics of Part 2 down, it&#8217;s time to start designing a multivariate test.  Every optimization project has different challenges and goals, but luckily there are a few rules that apply to every multivariate test design.  These rules fit into two categories: technical rules and content rules.</p>
<p><strong>Technical rules:</strong></p>
<ol>
<li>Choose the appropriate multivariate test type (<a href="http://testingblog.widemile.com/2008/07/24/primer-full-and-fractional-factorial-test-design/">full or fractional factorial</a>)</li>
<li>Determine the number of factors and levels that can be tested based on estimated conversion traffic (choose a test array)</li>
<li>Stop the test when it has stabilized, not based on your earlier estimations</li>
</ol>
<p>These rules ensure statistical significance by constraining the test to the appropriate size at the beginning and then letting the test gather the proper amount of data at the end.</p>
<p>Running a test full factorial, if your traffic supports it, may be a good choice if you&#8217;re testing content that you believe to have many interactions or if you only want to test 2 factors with 2 levels each.  (Note: the smallest fractional factorial test size is 3 factors with 2 levels each.)  Typically though, you&#8217;ll want to run a fractional factorial test to save time and expand the number of factors and levels you can test.</p>
<p>In order to find out how many factors and levels you can test, you need some idea of your predicted page views and conversions, as well as an estimate of lift.  Lift matters because a large lift gets you more conversions, so your test stabilizes more quickly.  Because of this, I would be conservative with lift estimates to ensure that the test is not designed too large.  At Widemile, we have a large list of arrays available to our tool and have calculated the approximate conversions needed to stabilize, allowing me to look at the three criteria I listed and find the arrays that are statistically viable for testing.  You should look for something similar with your tool of choice.</p>
<p>To figure out when a test has stabilized, I prefer to look primarily at level influence stabilization, with experiment conversion rate stabilization for support.  Widemile Optimize shows this using graphs, so I simply look for horizontal trending of lines, meaning winning levels and experiments stay winners and their level of influence or conversion rates stay fairly constant (look horizontal) over 3-5 days.  If you don&#8217;t have graphs available, look at the historical cumulative conversion rate for your experiments and see whether it still varies a lot across the latest few days of your test.</p>
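<p>If your tool has no graphs, the cumulative conversion rate check can be roughed out in a few lines.  This is only a sketch: the function names, the 5-day window (taken from the top of the 3-5 day range above) and the 5% tolerance are my own assumptions, not how Widemile Optimize decides stabilization, and it assumes at least one visitor on the first day.</p>

```python
from itertools import accumulate


def cumulative_rates(daily_visitors, daily_conversions):
    """Cumulative conversion rate at the end of each day of the test."""
    visitors = accumulate(daily_visitors)
    conversions = accumulate(daily_conversions)
    return [c / v for c, v in zip(conversions, visitors)]


def looks_stable(rates, window=5, tolerance=0.05):
    """True when the last `window` cumulative rates all sit within
    `tolerance` (relative) of their own mean. Both defaults are assumptions."""
    recent = rates[-window:]
    if len(recent) != window:
        return False  # not enough days of data yet
    mean = sum(recent) / window
    if mean == 0:
        return False  # no conversions, nothing to judge
    spread = max(abs(r - mean) for r in recent) / mean
    return tolerance >= spread


# Hypothetical data for one experiment: 1,000 visitors a day for 8 days.
rates = cumulative_rates([1000] * 8, [50, 30, 45, 38, 41, 40, 39, 41])
print(looks_stable(rates[:5]))  # first five days only
print(looks_stable(rates))      # all eight days
```

<p>Run the same check for each experiment; a test where the leaders keep swapping places will fail it even if each line looks flat for a day or two.</p>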
<p><strong>Content rules:</strong></p>
<ol>
<li>Every item you test should answer an important question</li>
<li>Test variety not quantity</li>
<li>Test opposites first then refine</li>
<li>Remember you can run more than one test</li>
</ol>
<p>The content rules are closely tied together.  In effect, they ensure that the items selected for testing have purpose and that they don&#8217;t needlessly expand the size of your test, reducing its efficiency.  I begin designing tests by creating hypotheses regarding issues with the page and then choose factors and design levels to address those issues.</p>
<p>An example hypothesis is &#8220;Having a hero shot on the right side of the page causes users to ignore the important value proposition on the left side.&#8221;  To test this, I would choose hero shot position as a factor and then have &#8220;left side hero shot&#8221; as the baseline level and &#8220;right side hero shot&#8221; as the second level.  This example also illustrates that, other than headlines and images, testing layout is possible with creative use of CSS and sometimes JavaScript.  As long as you can revert from one to another and it matches the other factors and levels, you are at liberty to test anything.</p>
<p>Coming back to the rules, make sure that you are testing as few items as possible to find out what you need.  Before testing a collection of lifestyle hero shots, choose one and test it against an iconic hero shot.  This will save you the time of going down a path of testing something that may not work.</p>
<p>Lastly, you aren&#8217;t going to be able to get the best page on the first run or even second, third, etc.  If you knew what your audience liked 100% of the time then you wouldn&#8217;t need testing.  Remember to think of your overall test plan beyond just the first run, so that you can answer all the questions you need without having to force everything into one test.</p>
<p>In summary, determine what you&#8217;re trying to achieve, select the proper testing method to meet those goals and then make sure to be purposeful and efficient with the content you end up testing in front of your visitors.  Testing and optimization is not difficult, although it can be tough to start.  Follow these rules and you&#8217;ll be on your way to conquering conversion rates, bounce rates, funnel drop-offs and many other metrics.</p>
<p>Photo credit: <a href="http://www.flickr.com/photos/arandalasch/3182768438/">Aranda\Lasch</a> (<a href="http://creativecommons.org/licenses/by-nc-nd/2.0/deed.en">CC</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.webtrends.com/optimization/2009/06/16/rules-successful-multivariate-test-billys-optimization-guide-part-3/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Are your visitors telling you if you&#039;re getting hotter or colder?</title>
		<link>http://blogs.webtrends.com/optimization/2008/11/13/visitors-telling-hotter-colder/</link>
		<comments>http://blogs.webtrends.com/optimization/2008/11/13/visitors-telling-hotter-colder/#comments</comments>
		<pubDate>Thu, 13 Nov 2008 18:30:37 +0000</pubDate>
		<dc:creator>Billy Shih</dc:creator>
				<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Testing Techniques]]></category>
		<category><![CDATA[test design]]></category>

		<guid isPermaLink="false">http://testingblog.widemile.com/?p=271</guid>
		<description><![CDATA[
In elementary school, I played the game Hot or Cold in class.  The rules of the game are simple:

One child is picked as the &#8220;searcher&#8221; and leaves the room
The class collectively chooses an object in the room, like a marker or eraser, for the searcher to find
Once the object is selected, the searcher returns to [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blogs.webtrends.com/optimization/files/2008/11/classroom.jpg"><img class="aligncenter size-full wp-image-290" src="http://blogs.webtrends.com/optimization/files/2008/11/classroom.jpg" alt="" width="450" height="300" /></a></p>
<p>In elementary school, I played the game Hot or Cold in class.  The rules of the game are simple:</p>
<ul>
<li>One child is picked as the &#8220;searcher&#8221; and leaves the room</li>
<li>The class collectively chooses an object in the room, like a marker or eraser, for the searcher to find</li>
<li>Once the object is selected, the searcher returns to the room and has to find the mystery object as quickly as possible</li>
</ul>
<p>To help the searcher out, the other kids in the room scream &#8220;hot&#8221; if the searcher gets closer to the object, or &#8220;cold&#8221; if they get farther away.</p>
<p>To make the game more challenging, the searcher might be limited to only one clue, just hot or just cold.  Kids that were told both hot and cold found the objects fairly quickly, but if they were only allowed one type of feedback, it took them much longer.</p>
<p>Just as it is hard to find the object in the game without hearing both when you are getting closer and when you are getting farther away, if you don&#8217;t design your tests with two distinct variations, you might wander for a long time trying to find out exactly what your customer wants.</p>
<p>My metaphor fails in one way, though.  In the game, the searcher does find the object eventually, even with just one type of hint.  If you don&#8217;t design tests correctly, however, you may never find a page that resonates strongly with the audience.  You might test dozens of testimonials and find the most successful testimonial, but if you never test it against no testimonial or a review, you may be missing out on even bigger gains.</p>
<p>Let your audience tell you hot and cold by designing your tests intelligently and they&#8217;ll help you find the optimal page faster than ever.</p>
<p>Photo credit: <a href="http://flickr.com/photos/airport/6550520/">Night Owl City</a> <a href="http://creativecommons.org/licenses/by-nc-sa/2.0/deed.en">CC</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.webtrends.com/optimization/2008/11/13/visitors-telling-hotter-colder/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>An Essential Primer on Full and Fractional Factorial Test Design</title>
		<link>http://blogs.webtrends.com/optimization/2008/07/24/primer-full-and-fractional-factorial-test-design/</link>
		<comments>http://blogs.webtrends.com/optimization/2008/07/24/primer-full-and-fractional-factorial-test-design/#comments</comments>
		<pubDate>Thu, 24 Jul 2008 17:14:11 +0000</pubDate>
		<dc:creator>Billy Shih</dc:creator>
				<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Terminology]]></category>
		<category><![CDATA[Testing Techniques]]></category>
		<category><![CDATA[design of experiments]]></category>
		<category><![CDATA[fractional factorial]]></category>
		<category><![CDATA[full factorial]]></category>
		<category><![CDATA[interactions]]></category>
		<category><![CDATA[partial factorial]]></category>
		<category><![CDATA[test design]]></category>

		<guid isPermaLink="false">http://testingblog.widemile.com/?p=194</guid>
		<description><![CDATA[
What are full and fractional factorial test designs? How do they relate to optimization and what about interactions?
Once you get down and dirty with testing, these questions matter. Whether selecting an optimization platform or trying to thoroughly understand the tests you are building, grasping these concepts will put you in greater control and allow you [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center"><img class="alignnone size-full wp-image-202 aligncenter" src="http://blogs.webtrends.com/optimization/files/2008/05/keys.png" alt="" width="271" height="192" /></p>
<p>What are full and fractional factorial test designs? How do they relate to optimization and what about interactions?</p>
<p>Once you get down and dirty with testing, these questions matter. Whether selecting an optimization platform or trying to thoroughly understand the tests you are building, grasping these concepts will put you in greater control and allow you to design and analyze your tests more effectively.</p>
<p>As simply as possible, I hope to educate you and other marketers about full and fractional factorial test designs and why <strong>fractional factorial is the best choice </strong>for multivariate testing of online campaigns.</p>
<p><em>Note: &#8220;Partial factorial” and “fractional factorial” are the same. Also, if you don&#8217;t have a thorough understanding of </em><a href="http://testingblog.widemile.com/optimization-glossary/experiment/"><em>experiments</em></a> <em>and </em><a href="http://testingblog.widemile.com/optimization-glossary/interaction/"><em>interactions</em></a><em>, please read those first.<br />
</em></p>
<p>The tests used in optimization are from the design of experiments field. (From <a href="http://en.wikipedia.org/wiki/Design_of_experiments">Wikipedia</a>: “<em>Design of experiments is the design of all information-gathering exercises where variation is present, whether under the full control of the experimenter or not.</em>”) The two types of tests I will focus on are fractional factorial and full factorial.</p>
<p>Here is an example I will use to explain these concepts.   Below is a test matrix outlining a test for a landing page with 5 factors with 2 levels each. Don&#8217;t let the vocabulary scare you away; it simply means that there are 5 parts of the page being tested and 2 variations of each.</p>
<p style="text-align: center"><a href="http://blogs.webtrends.com/optimization/files/2008/05/matrix.png"><img class="size-full wp-image-201 aligncenter" src="http://blogs.webtrends.com/optimization/files/2008/05/matrix.png" alt="" width="456" height="121" /></a></p>
<p style="text-align: center"><strong>Recipe Matrix:</strong> 5 factors = 5 parts (hero shot, headline, etc.) and 2 levels = 2 variations</p>
<p>These factors and their respective levels make up the possible combinations for a landing page. The combinations displayed are called <a href="http://testingblog.widemile.com/optimization-glossary/experiment/">experiments</a>.</p>
<p>Let&#8217;s calculate the total number of experiments possible (even if you know how to do this already, this is important to understanding the distinction between fractional and full factorial). There are 2 levels for each factor, so you can have 2&#215;2&#215;2&#215;2&#215;2 (2 to the 5th power) = 32 possible experiments. This means there are exactly 32 combinations of hero shots, headlines, sub headlines, button text and main copy from our matrix outlined above. Note that if we add another factor, it becomes 2 to the 6th power or 64 possible experiments. Likewise, if you add 2 more levels to any one of the existing 5 factors, the total increases from 32 to 4&#215;2&#215;2&#215;2&#215;2 = 64 experiments.</p>
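<p>To check the arithmetic yourself, here is a small sketch that counts and lists the experiments.  The factor names come from the matrix above; the numeric level labels (0 and 1) and the function names are placeholders of my own.</p>

```python
from itertools import product
from math import prod

# Factor name -> number of levels, following the example matrix.
FACTORS = {
    "hero shot": 2,
    "headline": 2,
    "sub headline": 2,
    "button text": 2,
    "main copy": 2,
}


def count_experiments(levels_per_factor):
    """Total combinations: the product of the level counts."""
    return prod(levels_per_factor)


def list_experiments(factors):
    """Every experiment as a dict of factor name -> level index."""
    names = list(factors)
    level_ranges = [range(levels) for levels in factors.values()]
    return [dict(zip(names, combo)) for combo in product(*level_ranges)]


print(count_experiments(FACTORS.values()))   # 32
print(count_experiments([2] * 6))            # 64, one extra factor
print(count_experiments([4, 2, 2, 2, 2]))    # 64, two extra levels on one factor
print(len(list_experiments(FACTORS)))        # 32
```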
<p>In testing, each experiment must get a minimum amount of measurable conversions, known as the sample size per experiment. This ensures that there is enough data for a solid statistical analysis. Therefore the more experiments you have, the more conversions you need. You can think of conversion data as time also, since the longer you leave your web page up, the more data you get.</p>
<p>Now we&#8217;re ready to go back to the difference between the two test designs. Full factorial testing requires that every possible experiment combination is shown, so our 5-factor test would need to display all 32 experiments. This means that if the sample size is 100 conversions per experiment, 3,200 conversions will be required. Fractional factorial works differently: it displays a much smaller number of experiments, about 8 in this case, so it would need about 800 conversions.</p>
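<p>One way to picture an 8-experiment fraction is the sketch below.  It lets three factors vary freely and derives the other two from them, which is a common textbook construction (a 2<sup>5-2</sup> design); I am assuming it purely for illustration, and the arrays in any particular tool, including ours, may be built differently.  Levels are coded as -1 and +1.</p>

```python
from itertools import product

SAMPLE_SIZE = 100  # conversions per experiment, as in the example above


def full_factorial(num_factors):
    """Every combination of two-level factors, coded as -1 / +1."""
    return list(product((-1, 1), repeat=num_factors))


def quarter_fraction():
    """An 8-run design for 5 factors: A, B, C vary freely,
    D = A*B and E = A*C (assumed generators, illustration only)."""
    return [(a, b, c, a * b, a * c) for a, b, c in product((-1, 1), repeat=3)]


def conversions_needed(design, sample_size=SAMPLE_SIZE):
    """Experiments shown multiplied by the sample size per experiment."""
    return len(design) * sample_size


full = full_factorial(5)
fraction = quarter_fraction()
print(len(full), conversions_needed(full))          # 32 3200
print(len(fraction), conversions_needed(fraction))  # 8 800
```

<p>Every level of every factor still appears in exactly half of the 8 experiments, which is why main effects can be compared fairly even though most combinations are never shown.</p>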
<p>Since full factorial gathers additional data, it reveals all possible <a title="Extended definitions for interactions" href="http://testingblog.widemile.com/optimization-glossary/interaction/">interactions</a>, but as seen by the numbers above, there is a trade-off. <strong>More data equals more information but more data also equals a longer test duration.</strong> The minimum data requirements for full factorial are very high since you are showing every experiment.</p>
<p>Even if you are using full factorial to get the same amount of information as a fractional factorial test, it will take more time since you need more data to see statistically relevant differences between the many experiments.</p>
<p>You might be wondering how fractional factorial can be accurate if interactions are possible.</p>
<p>Random interactions of high relevance are very rare, especially when looking for interactions of more than 2 factors. You really need to design tests where you look for meaningful interactions that are based on true business requirements rather than hoping for a random and low influence interaction between a red button, a hero shot and a headline.</p>
<p>Whatever the interaction is, you need to be able to understand your audience and infer why there was an interaction in the first place. Only then are you ready to start designing for interactions.</p>
<p>Tests should not be filled with random levels; they should be carefully designed for success by focusing on testable hypotheses around the audience.  Could a 1 pixel drop shadow on a button interacting with the copyright statement ever be truly significant, and not a victim of random error? Is it worth sacrificing thousands of conversions to learn a lesson that won’t result in any relevant increase of real world conversions?</p>
<p>Some interactions might make sense to measure, while others are not worth measuring because of the amount of testing time they add.</p>
<p>This brings me to fractional factorial.  <strong>It is possible for fractional factorial tests to detect interactions</strong>. How so? Using our example of a 5-factor test, fractional factorial can include everything from only main effects all the way to 4-factor interaction effects. Full factorial’s only difference is that it is the full extension and includes the 5-factor interaction effects.</p>
<p>Fractional factorial is not a one-trick pony; it is a continuum ranging from testing for no interactions (only main effects) to one factor less than full factorial. It is exactly what the name fractional implies: even one less is a &#8220;fraction&#8221; of full factorial. It gives you the power to make trade-offs between testing only main effects and testing for interactions based on intelligent test design.</p>
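<p>To put rough numbers on that continuum, the sketch below counts how many effects there are at each interaction order for two-level factors.  This is plain combinatorics (choose k of the 5 factors), not output from any testing tool, and the function names are my own.</p>

```python
from math import comb


def effects_by_order(num_factors):
    """Number of k-factor effects for two-level factors (k=1 is main effects)."""
    return {k: comb(num_factors, k) for k in range(1, num_factors + 1)}


def effects_up_to(num_factors, max_order):
    """Total effects to estimate when the design stops at max_order."""
    return sum(comb(num_factors, k) for k in range(1, max_order + 1))


print(effects_by_order(5))  # {1: 5, 2: 10, 3: 10, 4: 5, 5: 1}
for order in range(1, 6):
    print(order, effects_up_to(5, order))  # totals: 5, 15, 25, 30, 31
```

<p>A main-effects-only test has 5 effects to estimate, while going all the way to the single 5-factor interaction means 31, one fewer than the 32 experiments of the full factorial.</p>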
<p>Once you decide to test for all possible interactions, you are committing to a full-factorial test and incur the associated traffic requirements. I’d love to see a test design that is designed for full interactions and still makes sense! Not having the ability to reduce the number of interactions is a huge detriment rather than a benefit of solutions limited to full-factorial testing.</p>
<p>Radically shorter test times allow for many more smart marketing ideas to be tested and <strong>adapted</strong> based on what you learn from each test run. You, the marketer, have the ability to analyze your results and tweak follow-on tests to capitalize on what you learn. This common-sense approach is what hypothesis-based testing is all about and is very powerful. Focus on testing smart ideas to increase your conversion rate – that’s what matters most.</p>
<p>The graph below illustrates how much information is gained and the amount of testing needed, based on the number of interactions tested.</p>
<p style="text-align: center"><a href="http://blogs.webtrends.com/optimization/files/2008/05/effects-graph.png"><img class="alignnone size-full wp-image-203 aligncenter" src="http://blogs.webtrends.com/optimization/files/2008/05/effects-graph.png" alt="" width="476" height="427" /></a></p>
<p>In my experience, the red area shows how valuable the data is based on which effects are being tested, while the blue area shows the amount of data (or time) needed to gather the data to confirm those effects. The x-axis goes from left to right, from main effects to full factorial (5-factor effects).</p>
<p>At Widemile, we believe it is more effective to perform quick, successive tests detecting only main effects rather than randomly hoping for interactions. While interactions might give you small or even large gains, they will likely never trump the gains from additional testing, nor make up for the time and money lost looking for random interactions. The additional time required for full factorial tests is large and not many marketers want to wait more than a month for a test to complete.</p>
<p>Fractional factorial is preferred by a few camps, including <a href="http://www.widemile.com/">Widemile</a>, Omniture&#8217;s <a href="http://www.omniture.com/en/products/conversion/testandtarget">Test&amp;Target</a> (formerly Offermatica) and Interwoven&#8217;s <a href="http://www.optimost.com/">Optimost</a>. Full factorial is used in Google&#8217;s free <a href="http://www.google.com/websiteoptimizer/">Website Optimizer</a> and some tools offered by smaller providers.</p>
<p>Testing for all interactions sacrifices a lot of time. With the speed that audiences, marketing campaigns and seasons can change, it is important to get the most testing done in the least amount of time without sacrificing the quality of the data. Fractional factorial allows you to do just that, making it the wisest choice for multivariate testing.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.webtrends.com/optimization/2008/07/24/primer-full-and-fractional-factorial-test-design/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>3 difficult optimization results and what you can learn from them (3 of 3)</title>
		<link>http://blogs.webtrends.com/optimization/2008/04/30/difficult-optimization-results-learn-3/</link>
		<comments>http://blogs.webtrends.com/optimization/2008/04/30/difficult-optimization-results-learn-3/#comments</comments>
		<pubDate>Wed, 30 Apr 2008 17:16:19 +0000</pubDate>
		<dc:creator>Billy Shih</dc:creator>
				<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Testing Concerns]]></category>
		<category><![CDATA[Testing Techniques]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[analyzing tests]]></category>
		<category><![CDATA[mistakes]]></category>
		<category><![CDATA[optimization results]]></category>
		<category><![CDATA[stabilization]]></category>
		<category><![CDATA[test design]]></category>
		<category><![CDATA[test results]]></category>

		<guid isPermaLink="false">http://testingblog.widemile.com/?p=185</guid>
		<description><![CDATA[Note: This is the third post of a 3 part series, each focusing on one type of test result that is tough to deal with. Read the other 2 articles on highly mixed data and the original page beating the new variations.

Ready for the toughest of all test results?  I brought in Widemile&#8217;s Chief [...]]]></description>
			<content:encoded><![CDATA[<p><em>Note: This is the third post of a 3 part series, each focusing on one type of test result that is tough to deal with.</em> <em>Read the other 2 articles on <a href="http://testingblog.widemile.com/2008/03/31/difficult-optimization-results-learn-1/">highly mixed data</a> and <a href="http://testingblog.widemile.com/2008/04/10/difficult-optimization-results-learn-2/">the original page beating the new variations</a>.</em></p>
<p style="text-align: center"><img class="size-full wp-image-196 aligncenter" src="http://blogs.webtrends.com/optimization/files/2008/04/unstableground.jpg" alt="" width="240" height="192" /></p>
<p>Ready for the toughest of all test results?  I brought in Widemile&#8217;s Chief Scientist, Vladimir Brayman, for this post to help me with some of the concepts around this topic.  The last of the three results is when the <strong>results just won&#8217;t stabilize</strong>.</p>
<p><strong>How does this happen?</strong><br />
As long as you have homogeneous traffic and enough time, a test should stabilize.  Unfortunately, this is not always possible and I don&#8217;t know anyone with unlimited time.  The most obvious way this occurs is when a test is designed too large, meaning you don&#8217;t have enough conversion traffic for the number of variations you are trying to test.</p>
<p>Additionally, getting homogeneous traffic is not always easy.  If your sources are too different, you can have problems.  Text, banner, e-mail ads and even Yahoo vs Google traffic may behave differently.  The worst case is when these sources of traffic are added mid-test.  I have had tests where an e-mail campaign was run at the end of a test without my knowledge (until I asked about the huge spike in traffic!)</p>
<p>You can&#8217;t control all traffic coming to your page from some sources like PR, blogs, seasonal events and news.  This goes back to <a href="http://testingblog.widemile.com/2008/03/31/difficult-optimization-results-learn-1/">part 1, about highly mixed data</a>; everything there applies to this case too.</p>
<p>A test also may not stabilize because the test is designed with elements that are too similar. The same thing can happen when 2 elements are different but have approximately the same amount of impact. In these situations, your data will go back and forth on which of them are the winners.</p>
<p>Anything outside of your page that has a large influence can destabilize your test, including pieces of your funnel.  One symptom of this is when your clickthroughs are fairly consistent but the full conversions are not.  If you are testing a landing page and the sign-up process after it is very kludgy and difficult for users, then it can have a large impact on your test&#8217;s ability to stabilize.  This is especially true if the experience for visitors changes.  An example of this is visitors bailing from a purchase funnel because shipping to their area is prohibitively more expensive than other areas.  Although they would have converted if shipping was within the average price range, they ended up not converting because of something encountered outside of the landing page, skewing your results.  This happens in almost every test, but the magnitude of its impact depends on what exactly occurs.</p>
<p><strong>What can you do to prevent this?</strong></p>
<p>If you are using a testing tool different from what you normally track your conversions with, make sure you run a baseline test so that you can compare the numbers your testing tool gives you with the ones your conversion analytics produces.  They should be within about 10%-15% of each other over a week or so.  Finding a large discrepancy here will save you from headaches down the line.  This essentially double checks the expected traffic numbers by ensuring you are measuring your current conversion correctly, which allows you to design a test of the appropriate size.  By appropriate size, I mean a test that has enough testing time and that will get enough traffic within that time.</p>
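<p>The baseline comparison is simple enough to script.  In this sketch the conversion counts are made up, the function names are my own, and I have used the upper end of the 10%-15% range as the cutoff and the analytics number as the reference, both of which are assumptions.</p>

```python
def relative_discrepancy(tool_conversions, analytics_conversions):
    """Difference between the two counts as a fraction of the analytics count."""
    return abs(tool_conversions - analytics_conversions) / analytics_conversions


def baseline_agrees(tool_conversions, analytics_conversions, max_discrepancy=0.15):
    """True when the two systems are within max_discrepancy of each other."""
    return max_discrepancy >= relative_discrepancy(tool_conversions, analytics_conversions)


# Hypothetical one-week baseline: testing tool vs. conversion analytics.
print(relative_discrepancy(460, 500))  # 0.08
print(baseline_agrees(460, 500))       # True
print(baseline_agrees(380, 500))       # False, 24% apart
```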
<p>While easier said than done, it is important to look for new traffic that may be driven to your page and to segment it out.  Since this shares some of the same problems as highly mixed data, <a href="http://testingblog.widemile.com/2008/03/31/difficult-optimization-results-learn-1/">those solutions apply here too</a>.</p>
<p><strong>What can you do if this happens?</strong><a href="http://blogs.webtrends.com/optimization/files/2008/04/spinning.jpg"><img class="alignright size-medium wp-image-195" style="float: right" src="http://blogs.webtrends.com/optimization/files/2008/04/spinning.jpg" alt="" width="172" height="175" /></a></p>
<p>First, don&#8217;t cut your tests short unless you think more data won&#8217;t solve the problem.  If you don&#8217;t reach stabilization, you are wasting all the time you tested since you have inconclusive data.  Always try to be as conservative as possible and end tests only when you are very confident that the test is stabilized or that there is no other choice.</p>
<p>Think about restarting the test if it isn&#8217;t stable.  Use a smaller design.  Pick the important factors (pieces) and the levels (variations) that you think will perform and are drastically different from each other.  This prevents elements from looking unstable as they trade places as the top performer.</p>
<p>If your only problem is that 2 variations are vying for the winning position, then they likely perform about the same.  It probably is not worth your time to wait for them to stabilize, so stopping the test and going with either of them will likely make little difference to your conversion rates.</p>
<p>The problem of outside funnel influence is a bit harder, but not impossible to solve.  The best solution is to segment out the users that are determined to be unqualified.  For example, if you only ship or work with US customers and businesses, then filter out any users that are outside of the US and do your analysis from there.  This can be done at the data level if you can tell where the data came from; otherwise, it can be done with a splitter or qualification page that leads people into the appropriate funnel first.  This may itself affect your overall conversions, though, so these methods should be carefully tested as well.</p>
<p>From my experience, the problems I&#8217;ve listed in these three posts are either preventable or unlikely to occur. The value of having an optimization expert is that they can avoid these situations or at the very least extract useful lessons when they do happen.  Having said that, don&#8217;t be scared to test.  Once you get the hang of it, it is a lot of fun and one of the keys to effectively growing and maturing your online marketing campaigns.</p>
<p><a title="Attribution License" href="http://creativecommons.org/licenses/by/2.0/" target="_blank">CC</a> <a href="http://www.photodropper.com/photos/" target="_blank">photo</a> credit #1: <a title="ryaninc" href="http://www.flickr.com/photos/93086051@N00/672333312/" target="_blank">ryaninc</a> &#8211; <a title="Attribution License" href="http://creativecommons.org/licenses/by/2.0/" target="_blank">CC</a> <a href="http://www.photodropper.com/photos/" target="_blank">photo</a> credit #2: <a title="jurvetson" href="http://www.flickr.com/photos/44124348109@N01/462206324/" target="_blank">jurvetson</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.webtrends.com/optimization/2008/04/30/difficult-optimization-results-learn-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Great resource for landing page optimization</title>
		<link>http://blogs.webtrends.com/optimization/2008/04/11/great-resource-for-landing-page-optimization/</link>
		<comments>http://blogs.webtrends.com/optimization/2008/04/11/great-resource-for-landing-page-optimization/#comments</comments>
		<pubDate>Fri, 11 Apr 2008 16:57:15 +0000</pubDate>
		<dc:creator>Billy Shih</dc:creator>
				<category><![CDATA[Landing Page Optimization]]></category>
		<category><![CDATA[campaign optimization]]></category>
		<category><![CDATA[case study]]></category>
		<category><![CDATA[learning]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[MarketingExperiments]]></category>
		<category><![CDATA[test design]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://testingblog.widemile.com/?p=190</guid>
		<description><![CDATA[I just received a link to an amazing resource from MarketingExperiments, it&#8217;s a compilation of great webinar summaries and case studies that they have done.  They cover topics from landing page optimization to price testing to PPC and more.  While not everything is about testing specifically, all their advice and ideas can be [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blogs.webtrends.com/optimization/files/2008/04/marketingexperiments.jpg"><img class="alignnone size-full wp-image-191 alignright" style="float: right" src="http://blogs.webtrends.com/optimization/files/2008/04/marketingexperiments.jpg" alt="" width="150" height="126" /></a>I just received a link to an amazing resource from MarketingExperiments: <a href="http://www.marketingexperiments.com/rchiveii.html">a compilation of great webinar summaries and case studies</a> that they have done.  They cover topics from landing page optimization to price testing to PPC and more.  While not everything is about testing specifically, all their advice and ideas can be tested, which is why I think you all will find it valuable.</p>
<p>All testing should be carefully designed; it should be focused on best practices and tactics that are predicted to connect with the audience.  You <em>should take risks</em> when testing, but they should be calculated risks.</p>
<p><a href="http://www.marketingexperiments.com/rchiveii.html">Check it out</a> and soak up some knowledge on optimization and get ideas to test on your site.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.webtrends.com/optimization/2008/04/11/great-resource-for-landing-page-optimization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>3 parts to picking a test page</title>
		<link>http://blogs.webtrends.com/optimization/2008/03/26/3-parts-to-picking-a-test-page/</link>
		<comments>http://blogs.webtrends.com/optimization/2008/03/26/3-parts-to-picking-a-test-page/#comments</comments>
		<pubDate>Wed, 26 Mar 2008 19:29:54 +0000</pubDate>
		<dc:creator>Billy Shih</dc:creator>
				<category><![CDATA[Methodology]]></category>
		<category><![CDATA[goals]]></category>
		<category><![CDATA[roi]]></category>
		<category><![CDATA[starting out]]></category>
		<category><![CDATA[test design]]></category>

		<guid isPermaLink="false">http://testingblog.widemile.com/2008/03/26/3-parts-to-picking-a-test-page/</guid>
		<description><![CDATA[&#160;

Alright, so you&#8217;re ready to test.  You&#8217;ve got tools and the skills to design a test.  But when optimization begins, where should you start? While we all would love ROI to be the only driving factor in optimization, your resources and reach usually dictate what you end up testing, as well as ROI.
Here&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center">&nbsp;</p>
<p style="text-align: center"><img src="http://blogs.webtrends.com/optimization/files/2008/03/curvedstream.jpg" alt="Curved Stream" height="266" width="200" /></p>
<p>Alright, so you&#8217;re ready to test.  You&#8217;ve got tools and the skills to design a test.  But when optimization begins, where should you start? While we all would love ROI to be the only driving factor in optimization, your resources and reach usually dictate what you end up testing, as well as ROI.</p>
<p>Here&#8217;s what I think about when looking for candidate test pages:</p>
<ol>
<li><strong>Importance:</strong>  Is this the best page to accomplish your goals?  Take a look at your overall marketing campaign and see how important this is to the whole process.  Look at drop-off points and find pages that are important but weak links.</li>
<li><strong>Technical:</strong>  Will this page be easy to optimize? How much technical involvement will it require?  If you&#8217;re testing dynamic elements you may need some additional help.  Maybe you can test a page outside of the development schedule or on a separate server.  Look for pages that have fewer restrictions and can be modified quickly.  Also examine the page for what can&#8217;t be tested and what you may want to test on it.  Some tests are harder to create than others, both technically and creatively.</li>
<li><strong>Goals:</strong>  What are you optimizing for?  Pages with one goal are easier to optimize since you can drive everything on the page towards achieving that one goal.  If you think a page is underperforming, it may also be an easier page to optimize.  Lastly, think about how easy that goal will be to measure.  If there are multiple conversion possibilities or the conversions are offline, it will be difficult to test.</li>
</ol>
<p>In the end you are asking a multi-part question: <em>Will a lift here be more valuable than a smaller/same/larger lift elsewhere that will take X amount of work and time?</em></p>
<p>Just remember, you can always test a page later on, even if it may not be the best candidate now.  If you have the resources you can test pages simultaneously too.   Just make sure they don&#8217;t impact each other in any way, so as to not skew your results.</p>
<p>As you optimize more and more, it will be harder to choose, but that&#8217;s a good sign.  It means you&#8217;ve got a lot of great pages and that is what you want testing to do for you.</p>
<p><em><a href="http://flickr.com/photos/nikonvscanon/2081397439/">Photo Source</a> (<a href="http://creativecommons.org/licenses/by/2.0/deed.en">under CC</a>)<br />
</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.webtrends.com/optimization/2008/03/26/3-parts-to-picking-a-test-page/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>3 steps to quickly make a good multivariate test</title>
		<link>http://blogs.webtrends.com/optimization/2008/02/21/3-steps-to-quickly-make-a-good-multivariate-test/</link>
		<comments>http://blogs.webtrends.com/optimization/2008/02/21/3-steps-to-quickly-make-a-good-multivariate-test/#comments</comments>
		<pubDate>Thu, 21 Feb 2008 23:22:45 +0000</pubDate>
		<dc:creator>Billy Shih</dc:creator>
				<category><![CDATA[Methodology]]></category>
		<category><![CDATA[Testing Concerns]]></category>
		<category><![CDATA[multivariate testing]]></category>
		<category><![CDATA[test design]]></category>

		<guid isPermaLink="false">http://testingblog.widemile.com/2008/02/21/3-steps-to-quickly-make-a-good-multivariate-test/</guid>
		<description><![CDATA[Having great testing technology puts a lot of power in your hands.  You can test anything and everything you want.  However, like any other tool, to use it effectively you have to use it right.  There&#8217;s a lot of best practices and thought that goes into test design, but following these three [...]]]></description>
			<content:encoded><![CDATA[<p>Having great testing technology puts a lot of power in your hands.  You can test anything and everything you want.  However, like any other tool, to use it effectively you have to use it right.  A lot of best practices and thought go into test design, but following these three rules can get you a good test in most situations.</p>
<div style="text-align:center"><img src="http://widemile.files.wordpress.com/2008/02/steps.jpg" alt="Steps" width="211" height="280" /></div>
<ol>
<li><strong>Maximize your traffic:</strong> Pack as much into a test as your traffic allows while still keeping the test short.  With Widemile&#8217;s platform that&#8217;s 2 weeks to be safe; with Google Optimizer you should plan on at least a month (<a href="http://testingblog.widemile.com/2008/01/28/why-google-optimizer-is-free-its-old-and-slow/">explanation</a>).</li>
<li><strong>Test opposites:</strong> If you test things that are similar, they&#8217;ll perform about the same.  So find out the general theme you should be following first by testing opposites (B2B vs B2C, podcast vs ebook, descriptive vs benefits).</li>
<li><strong>Learn from the previous test:</strong> Always make sure you line up your tests so that you learn something that can be used in the next one to either refine or to learn something new.</li>
</ol>
<p>The goal of these three rules is to make the most of your testing time by testing as much as possible while minimizing the time spent on suboptimal content.  For example, if I were selling iPods and I tested 2 images of people running with the iPod, one with a man and the other with a woman, I might think that was a good test.  However, I could have totally missed out on an image that worked better, such as an iPod next to a PC.  I could test that after the initial test, but then I would have wasted one test run.  The right way would be to test one sport image versus one PC image and find out which direction to go.  From there I could test against other opposing images or refine the PC image.</p>
<p>The only warning I&#8217;d throw in is that if you&#8217;re trying to test a lot of things at once, you might want to scale back.  Pick 2-4 themes depending on your test size and stick to testing them out.  Don&#8217;t mix and match.</p>
<p>Follow these steps and you&#8217;re on your way to getting not quick tests, but efficient ones.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.webtrends.com/optimization/2008/02/21/3-steps-to-quickly-make-a-good-multivariate-test/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What is Taguchi?  How does it relate to testing?</title>
		<link>http://blogs.webtrends.com/optimization/2008/02/14/what-is-taguchi-how-does-it-relate-to-testing/</link>
		<comments>http://blogs.webtrends.com/optimization/2008/02/14/what-is-taguchi-how-does-it-relate-to-testing/#comments</comments>
		<pubDate>Thu, 14 Feb 2008 16:00:21 +0000</pubDate>
		<dc:creator>Billy Shih</dc:creator>
				<category><![CDATA[Terminology]]></category>
		<category><![CDATA[design of experiments]]></category>
		<category><![CDATA[fractional factorial]]></category>
		<category><![CDATA[multivariate testing]]></category>
		<category><![CDATA[taguchi]]></category>
		<category><![CDATA[taguchi method]]></category>
		<category><![CDATA[test design]]></category>

		<guid isPermaLink="false">http://testingblog.widemile.com/2008/02/14/what-is-taguchi-how-does-it-relate-to-testing/</guid>
		<description><![CDATA[
Multivariate testing is a buzz word these days, but the buzzword of multivariate testing seems to be Taguchi.  However, that term is being abused.  Do you know what Taguchi really means? I wasn&#8217;t even positive, so to get some background, I did some research and talked with Vladimir (Widemile&#8217;s Chief Scientist).
The name and [...]]]></description>
			<content:encoded><![CDATA[<div style="text-align:center"><img src="http://widemile.files.wordpress.com/2008/02/taguchi.png" alt="the Taguchi method" /></div>
<p>Multivariate testing is a buzzword these days, but the buzzword of multivariate testing seems to be Taguchi.  However, that term is being abused.  Do you know what Taguchi really means? I wasn&#8217;t even positive, so to get some background, I did some research and talked with Vladimir (Widemile&#8217;s Chief Scientist).</p>
<p>The name and method come from Genichi Taguchi.  His method, also known as Robust Design, attempted to improve product manufacturing quality.  Therefore it falls into an area of engineering called Quality Engineering.</p>
<p>Does this sound aligned with website testing?  Not really, and this is the problem with using the term Taguchi for website testing.  The goals of manufacturing and the goals of a website are not the same.</p>
<p>What most people are attempting to grasp when using the term Taguchi is <em>fractional factorial test design</em>.  (I discussed this at length in my post about <a href="http://testingblog.widemile.com/2008/01/28/why-google-optimizer-is-free-its-old-and-slow/">the difference between Widemile&#8217;s technology and Google Optimizer</a>.) The Taguchi method uses a fractional factorial test design and is under the umbrella of fractional factorial testing but is not the only or best fractional factorial method.  In fact, even within manufacturing, the Taguchi method was the inspiration for many new techniques but many statisticians find it flawed.*</p>
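<p>To make the idea of a fractional factorial design concrete, here is a minimal Python sketch.  It is a textbook half fraction of a two-level, three-factor design, not Widemile&#8217;s or Google&#8217;s actual algorithm, and the factor names are hypothetical page elements chosen only for illustration.</p>

```python
from itertools import product

# Hypothetical page elements, each tested at two levels coded as -1 and +1.
factors = ["headline", "image", "button"]

# Full factorial: every combination of levels, 2**3 = 8 page versions.
full = list(product([-1, 1], repeat=len(factors)))

# Half fraction: keep only the runs whose coded levels multiply to +1
# (the defining relation I = ABC), which leaves 4 page versions.
half = [run for run in full if run[0] * run[1] * run[2] == 1]

print(len(full), "runs in the full factorial")
print(len(half), "runs in the half fraction")
for run in half:
    print(dict(zip(factors, run)))
```

<p>Each factor still appears at each level equally often in the half fraction, so main effects can be estimated from half the page versions.  The trade-off in this particular fraction is that each main effect is confounded with a two-factor interaction.</p>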
<p>It is important to differentiate the Taguchi method from fractional factorial test design since one is a basis for manufacturing while the other is purely related to design of experiments.   You need to ensure that the math and science behind your testing is based on methods that have the end goal of optimizing your website only.  So if your testing tool uses the Taguchi method for testing, you better ask what that really means.</p>
<p>So does Widemile use Taguchi?  We don&#8217;t use the Taguchi method, but we do use fractional factorial test design.  I like to say that our platform goes beyond Taguchi because it was specifically made for optimizing web content.</p>
<p>Don&#8217;t get sucked into the Taguchi method; it is just a buzzword used by your fellow marketers.  Just because a technology doesn&#8217;t use Taguchi doesn&#8217;t mean you should count it out.</p>
<p>*Read more after the jump for Vladimir&#8217;s explanation of the Taguchi method and its criticisms<br />
<span id="more-166"></span> The following is written by Vladimir Brayman, Chief Scientist at Widemile.  If you have any questions for him or me, leave a comment and I will try to get back to you ASAP.</p>
<p>Sometimes the term Taguchi method is used mistakenly to mean fractional factorial design. In fact, the Taguchi method is much narrower in both its scope and objectives. The Taguchi method (also known as robust design) belongs to an engineering discipline called Quality Engineering. The main objective of Quality Engineering design is to minimize variability in the performance of a product under different environmental conditions.  The main characteristics of the Taguchi method stem from that objective. Among the steps involved in the Taguchi method are:</p>
<ol>
<li>Defining two types of factors, control and noise. Control factors can be manipulated by a production team during the manufacturing process whereas noise factors model environmental impacts on the product and thus cannot be controlled precisely.</li>
<li>Defining two orthogonal arrays – usually with mixed levels and of strength 2 (this implies that only main effects can be detected) – one array for the control factors and the other for the noise factors.</li>
<li>Maximizing the signal-to-noise ratios, a logarithmic function of the ratio between the square of the average responses due to the control factors and the estimate of the variance due to the noise factors.</li>
</ol>
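<p>As a sketch of the third step, the Python below computes one common form of the signal-to-noise ratio, often called &#8220;nominal the best&#8221;: ten times the base-10 logarithm of the squared mean response divided by the sample variance.  The response values and the <code>signal_to_noise</code> helper are hypothetical and only illustrate the calculation.</p>

```python
import math
from statistics import mean, variance


def signal_to_noise(responses):
    """Nominal-the-best S/N ratio in decibels: 10 * log10(mean^2 / variance)."""
    return 10 * math.log10(mean(responses) ** 2 / variance(responses))


# Hypothetical responses for two control settings, each measured under
# the same four noise conditions.
setting_a = [10.0, 12.0, 9.0, 11.0]
setting_b = [10.4, 10.6, 10.5, 10.5]

print("setting A:", round(signal_to_noise(setting_a), 2), "dB")
print("setting B:", round(signal_to_noise(setting_b), 2), "dB")
```

<p>Both settings have the same average response, but the second varies far less across the noise conditions, so it earns the higher ratio.  That preference for low variability is what the method is built to reward.</p>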
<p>Statisticians criticized unjustified claims of almost limitless applicability of the Taguchi method by some of the researchers. Among the critiques are:</p>
<ol>
<li>There is no possibility of detecting interactions among the control factors.</li>
<li>There are N1*N2 observations needed, where N1 is the number of level combinations of the control array and N2 is that of the noise array. However, the confounding structure for the control factors is the same as that of an array of size N1. This implies that the same resolution can be obtained with a much smaller number of runs.</li>
<li>The influence of the noise factors on the response variables cannot be detected.</li>
</ol>
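<p>The N1*N2 run count in the second critique can be seen in a small Python sketch.  The two arrays below are hypothetical examples, not taken from any real study: every row of the control array is run under every row of the noise array.</p>

```python
from itertools import product

# Hypothetical arrays: 4 control-factor runs and 4 noise-factor runs,
# with levels coded as -1 and +1.
control_array = [(-1, -1, 1), (-1, 1, -1), (1, -1, -1), (1, 1, 1)]
noise_array = [(-1, -1), (-1, 1), (1, -1), (1, 1)]

# The crossed design pairs every control row with every noise row.
crossed = list(product(control_array, noise_array))

n1, n2 = len(control_array), len(noise_array)
print(n1, "control runs x", n2, "noise runs =", len(crossed), "observations")
```

<p>The number of observations multiplies, yet what can be learned about the control factors is still limited by the control array alone.</p>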
<p>To conclude, some people mistakenly call fractional factorial design the Taguchi method. Use of the genuine Taguchi method for Landing Page Optimization is not justified.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.webtrends.com/optimization/2008/02/14/what-is-taguchi-how-does-it-relate-to-testing/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>
