The t-test was developed by William Sealy Gosset, who was working at the Guinness brewery over a hundred years ago. He developed this measure to determine things like the difference in barley yield. He wanted to publish this statistical research to share with other statisticians, but the brewery was fearful. They didn't want him to publish; they didn't want him to give away any secrets. He ultimately reassured them, but he had to publish under the pseudonym Student. So instead of being known as Gosset's t-test, it's known as Student's t-test.

In this video I'm going to start by conceptually showing you what the t-test is. I'll then show you how to calculate the t-value, run a t-test, and then finally how to do a t-test in just a few seconds using a spreadsheet.

So imagine I have two fields of barley, field one and field two, and I want to compare them. But I don't want to cut down the whole field; I just want to take some samples, samples from one and samples from two. So I could get a sample from field one. Now it won't be a perfect normal distribution like this; it's going to be more of a histogram that looks like that. And then I'm going to get a sample from field two. Now which one of these has a higher yield? Well, we could figure out the mean, the average, of each of those samples, and it looks like the average in field two is higher than the average in field one. But that's only part of the picture. The mean only tells us so much, because we could have different distributions, and depending on that distribution, or the variance within each sample, there could be a statistically significant difference between the two or not.

That's where the t-value comes in handy. It's really a ratio of signal to noise. Signal is going to be numbers that tell me the difference between these two samples, and noise is going to be numbers that kind of get in the way. So how do I figure out the signal? Well, the best way to do that is simply to find the difference
between the two means. If I calculate the mean of sample one and the mean of sample two, we'll call those x-bar one and x-bar two, and the absolute value of the difference between the two tells you how much signal there is, how much change there is.

Now how do we get at the noise? That's going to be in the variability of the groups themselves, and so the formula is going to look something like this. What is s1? That's the standard deviation. Remember, that's how far our data is spread from the mean. But we're not just using the standard deviation; we're actually squaring it, and that gives us something called the variance. So if I increase the variance, that's going to lower my t-value; it's giving me more noise. The other factor in here is the number of samples that I'm taking. As I increase the number of samples, that will actually increase the t-value, up to a point. So again, a bigger difference between the means gives us more signal and a higher t-value, and increasing the variability is actually going to decrease it.

So let me show you how to calculate that t-value. I'm using Excel, but you could use Google Sheets or even your TI calculator. If you look at these two samples from field one and field two, can you tell which one has a higher yield? It's really hard just looking at it. Is there a difference between the two, and how much is that difference? I'll use the t-value to calculate that.

The first thing we have to figure out is the mean. In a spreadsheet you type =AVERAGE instead of mean. I'm going to put a left parenthesis, select that entire sample set from field one, and close the parenthesis, and I get a mean of 15.38. Now I can select that and drag over, and I get a mean of 15.68 in field two. The next thing I have to figure out is my standard deviation, so that's =STDEV and a left parenthesis. I'm now going to select that field one data, and now I close the
parentheses, so we have a standard deviation of 0.3124. I'm now going to apply that to field two's data set, and we have a higher standard deviation there. Remember, I now have to calculate the variance, and to do that you have to square the standard deviation. So I'm going to select that cell and raise it to the second power. There's my variance for field one, and here's my variance for field two. Finally, I have to know how much data I'm actually collecting. If you type =COUNT, that'll count the number of data points, so I'm going to count those and we get 16, and it's going to be 16 in the next one as well. Now you could use a spreadsheet to calculate this, or you could do it by hand, but it takes a long time to figure out standard deviation by hand, so I encourage you to use something like a spreadsheet.

Now that I have all these values, I'm simply going to plug them into my t-value formula like that. We've got the signal on the top, so I'm going to find the difference between these two means, and then I'm going to figure out my noise on the bottom. Remember, you have to divide each variance by its sample size, add those together, and then take the square root. If I do the work for you, we've got a signal of 0.30 and a noise of 0.13, so I've got a t-value of 2.3. What does that mean? Since it's greater than 1, that means there's more signal than noise. I'm going to set that over here to the side, because this video is not about the Student's t-value, it's about the Student's t-test.

So now we're going to run a t-test. What are we testing? We're testing our null hypothesis, just like we do in a chi-square test. We start with a null hypothesis that says there's no statistically significant difference between the samples. In other words, any difference that we find is simply due to chance. You then determine a critical value, a number. If our t-value is lower than that, then we don't reject our null hypothesis, but
if we get a t-value that's higher than the critical value, then we reject our null hypothesis. There must be an alternative hypothesis; there could be something going on between these two fields.

Now how do we find that critical value? We'll use a t-table that looks like this. It looks confusing, but it's really not that bad. This one is for a two-tailed test, and I'll show you what that means in just a second. The first thing you have to know is what probability we're going to use. Generally in science we use the 0.05 probability, so that's going to be this column right here. What does that mean? Well, this is an inferential statistic. It means that if there were really no difference and we did this sampling a hundred times, we'd expect to wrongly reject the null hypothesis only about 5 of those times. So it has a lot to do with chance. I'm going to use that 0.05.

Now we have to figure out what row to use, and to do that we have to know how many samples we collected and figure out the degrees of freedom. The degrees of freedom are the sum of n1 and n2, minus 2. Since we had 16 from each, it's going to be 32 minus 2, or 30 degrees of freedom. So here's our critical value: it's 2.04. Is our t-value higher than that? This is where we're doing the actual t-test. Are we higher than 2.04? We are. And what does that mean? We're going to reject our null hypothesis. That means there is something statistically significant between these two sample sets. Now it's not much higher than that; remember, it's just 2.3, and if we look over here at the 0.025 probability, we can see that we're actually below that critical value. So we're not positive, but we're pretty sure that there's something statistically significant between these two.

Now that was a lot of work. We had to calculate our variances, our means, and our sample sizes, and then find this table. The neat
thing about a spreadsheet is that it can calculate a t-test very quickly. What I'm going to do is type "t-test" here as a label, and then in the next cell I'll write =TTEST. There are four things I have to put in for a t-test. The first one is my sample set from field one, so I'm going to select that. Then I put a comma in, and now I grab my data set from field two. Then I put another comma; we're doing a two-tailed test, and this is an independent test. I'll show you what that is in just a few seconds. But you can see that in just a few seconds we've calculated my probability, or my p-value. What is it? It's 0.026. What does that mean? It's slightly above 0.025, somewhere in between 0.05 and 0.025. So in just a few seconds we're able to see that we need to reject that null hypothesis. It's really simple to do a t-test very quickly in a spreadsheet.

Now we did an independent t-test, or an unpaired test. What does that mean? We had two different fields that we were comparing, so you could be comparing, for example, two different populations. You could also run a paired t-test, and you would have to select that when you're running the t-test. That would be if we're sampling the same population twice. So perhaps we're looking at field two, but then we apply a chemical and look at it again. That would be a paired test.

We're also doing a two-tailed test. When we're figuring out that probability of 0.05, you can think of it like this: this is the 0.95 where we don't reject the null hypothesis, and that 0.05 is actually split between the two tails, because we're not sure which direction the difference is going to be. You could also run a one-tailed test if you're sure of the directionality, but you have to be careful when you're running it.

There are a few assumptions you have to make when
you're running a t-test. Number one, we should have a normal distribution in both the populations and the samples, though the test works really well with a small sample size. We should also have similar variances in each of those samples. Then, when we're looking at the data points, we should have roughly the same number of data points in each sample. And finally, this works well with low numbers, but you generally want to be in the 20 to 30 range when you're collecting samples. If we go much higher than that, instead of using a t-test we'd actually use a z-test.

So did you learn everything I showed you? Well, now's a chance to practice it. I've got a sample set over on the left side. Imagine we have two types of plants, plant A and plant B, and let's say we're looking at the leaves that each of those plants has. Is there a statistical difference between those in plant A and plant B? You should run a t-test. I'll put a link to an Excel file down below. And once you figure it out, what are you trying to figure out again? Do we fail to reject, or do we reject, the null hypothesis? I'd love to know what you think; put that in the comments down below. And I hope that was helpful.
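For readers who would rather script the hand calculation than use a spreadsheet, here is a minimal Python sketch of the signal-to-noise ratio described in the transcript: the absolute difference between the means, divided by the square root of the summed variance-over-sample-size terms. It is an illustration under stated assumptions, not the video's own workflow. The function name `t_value` and the yield numbers are invented for the example, and the critical value 2.228 is the standard two-tailed t-table entry for p = 0.05 at 10 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def t_value(sample1, sample2):
    """Return (t, degrees_of_freedom) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    # Signal: absolute difference between the two sample means.
    signal = abs(mean(sample1) - mean(sample2))
    # Noise: statistics.variance is the sample variance, i.e. the
    # square of what a spreadsheet's STDEV returns.
    noise = sqrt(variance(sample1) / n1 + variance(sample2) / n2)
    return signal / noise, n1 + n2 - 2


# Hypothetical yields, invented for illustration (NOT the video's data).
field1 = [15.1, 15.4, 15.2, 15.6, 15.3, 15.5]
field2 = [15.6, 15.9, 15.5, 15.8, 15.7, 16.0]

t, df = t_value(field1, field2)
critical = 2.228  # two-tailed, p = 0.05, df = 10, read from a t-table

print(f"t = {t:.2f}, df = {df}")
print("reject the null hypothesis" if t > critical
      else "fail to reject the null hypothesis")
```

The critical value still has to be looked up in a t-table for the degrees of freedom in your own data; the sketch deliberately stops short of computing a p-value, which is the part the spreadsheet's t-test function handles for you.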