This activity will focus on the location and amount of genetic
library( tidyverse )
library( gstudio )
data( arapat )
The data for this activity are included in the gstudio library and represent a set of nuclear co-dominant loci (named LTRS, WNT, EN, EF, ZMP, AML, ATPS, MP20) assayed for 363 individuals and grouped into 3 partitions.
summary( arapat )
Species        Cluster     Population   ID           Latitude       Longitude       LTRS         WNT
Cape     : 75  CBP-C :150  32     : 19  101_10A:  1  Min.   :23.08  Min.   :-114.3  01:01  :147  03:03  :108
Mainland : 36  NBP-C : 84  75     : 11  101_1A :  1  1st Qu.:24.59  1st Qu.:-113.0  01:02  : 86  01:01  : 82
Peninsula:252  SBP-C : 18  Const  : 11  101_2A :  1  Median :26.25  Median :-111.5  02:02  :130  01:03  : 77
               SCBP-A: 75  12     : 10  101_3A :  1  Mean   :26.25  Mean   :-111.7               02:02  : 62
               SON-B : 36  153    : 10  101_4A :  1  3rd Qu.:27.53  3rd Qu.:-110.5               03:04  :  8
                           157    : 10  101_5A :  1  Max.   :29.33  Max.   :-109.1               (Other): 15
                           (Other):292  (Other):357                                              NA's   : 11
EN           EF           ZMP          AML          ATPS         MP20
01:01  :225  01:01  :219  01:01  : 46  08:08  : 51  05:05  :155  05:07  : 64
01:02  : 52  01:02  : 52  01:02  : 51  07:07  : 42  03:03  : 69  07:07  : 53
02:02  : 38  02:02  : 90  02:02  :233  07:08  : 42  09:09  : 66  18:18  : 52
03:03  : 22  NA's   :  2  NA's   : 33  04:04  : 41  02:02  : 30  05:05  : 48
01:03  :  7                            07:09  : 22  07:09  : 14  05:06  : 22
(Other): 16                            (Other):142  08:08  :  9  (Other):119
NA's   :  3                            NA's   : 23  (Other): 20  NA's   :  5
Create all potential genotypes de novo for a locus with 3 alleles.
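One way to approach this is sketched below in base R. It assumes the three alleles are labelled 01, 02, and 03 and that genotypes are written as colon-separated strings, as in the summary output above. The variable names are illustrative only.

```r
# Hypothetical allele labels; any three distinct labels would work.
alleles <- c( "01", "02", "03" )

# Homozygotes pair each allele with itself.
homozygotes <- paste( alleles, alleles, sep = ":" )

# Heterozygotes are the unordered pairs of distinct alleles.
heterozygotes <- combn( alleles, 2, FUN = paste, collapse = ":" )

# With k alleles there are k * (k + 1) / 2 genotypes, so 6 when k = 3.
genotypes <- c( homozygotes, heterozygotes )
print( genotypes )
```

The heterozygotes are built from unordered pairs, so 01:02 and 02:01 are treated as the same genotype.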
At the ATPS locus, there are several genotypes that are observed only once in the entire data set. What are these genotypes and which populations are they found in?
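A possible approach is to count how often each genotype occurs and keep the rows whose genotype has a count of one. The sketch below uses a hypothetical helper, singleton_genotypes(), and a small made-up data frame in place of arapat so that it runs on its own. It assumes that as.character() on a genotype column yields colon-separated strings like those shown in the summary output.

```r
# Hypothetical helper: rows whose genotype at `locus` occurs exactly once.
singleton_genotypes <- function( df, locus, group = "Population" ) {
  genotype <- as.character( df[[ locus ]] )
  counts <- table( genotype )
  once <- names( counts )[ counts == 1 ]
  keep <- genotype %in% once
  data.frame( group = as.character( df[[ group ]] )[ keep ],
              genotype = genotype[ keep ],
              stringsAsFactors = FALSE )
}

# Toy data standing in for arapat (made-up values, for illustration only).
toy <- data.frame( Population = c( "A", "A", "B", "B", "C" ),
                   ATPS = c( "05:05", "05:05", "03:03", "02:09", "03:03" ) )
rare <- singleton_genotypes( toy, "ATPS" )
print( rare )
```

With the libraries loaded as at the top of this activity, the equivalent call would be singleton_genotypes( arapat, "ATPS" ). The helper does not treat missing genotypes specially.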
Look at the composition of Populations in the arapat data set with particular attention to Species. This species is a parasite on a limited habitat resource. Ecologically, what is happening here?
t <- table( arapat$Species, arapat$Population )
t
          101 102  12 153 156 157 159 160 161 162 163 164 165 166 168 169 171 173 175 177  32  48  51  58  64  73  75  77  84  88  89   9
Cape        0   0   0   0   6   8   0   0   0   0   3   2   0   2   0   0   0   0   0   0   0  10   0   0   0   8  10   1   0   0   0   0
Mainland    9   8   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  19   0   0   0   0   0   0   0   0   0   0   0
Peninsula   0   0  10  10   0   2   9  10  10  10   7   8  10   8  10  10  10  10   7  10   0   0   7   9   5   2   1   9   9  10  10   9
             93    98   Aqu Const  ESan   Mat   SFr
Cape          0     3     4     8     6     4     0
Mainland      0     0     0     0     0     0     0
Peninsula    10     1     4     3     2     1     9
There are only 2 populations (156 and 48) where the Cape species occurs in isolation; all the rest of its populations also contain individuals designated as the Peninsula species. This parasite lives on a restricted habitat, so this may be a situation where the two are in competitive exclusion on the limited habitat.
First I would just grab the populations where they co-occur. I can do this either programmatically or by just looking at the table from the previous question. Then I would filter the data down to just those populations.
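The programmatic route could look something like the sketch below. The helper co_occurring() is hypothetical, and the toy data frame is made up so the example runs without gstudio; it only needs Species and Population columns.

```r
# Hypothetical helper: names of populations containing more than one species.
co_occurring <- function( df ) {
  tab <- table( df$Species, df$Population )
  n_species <- colSums( tab > 0 )   # species present per population
  names( n_species )[ n_species > 1 ]
}

# Toy data standing in for arapat (made-up values, for illustration only).
toy <- data.frame( Species = c( "Cape", "Cape", "Peninsula", "Peninsula", "Cape" ),
                   Population = c( "156", "157", "157", "12", "Aqu" ) )
shared <- co_occurring( toy )

# Keep only rows from shared populations and drop unused factor levels.
sympatry_toy <- droplevels( toy[ toy$Population %in% shared, ] )
print( sympatry_toy )
```

On the real data the same idea would be sympatry <- droplevels( arapat[ arapat$Population %in% co_occurring( arapat ), ] ). The summary that follows lists only Species, Population, and the loci, so the other columns were presumably dropped as well.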
summary( sympatry )
Species       Population  LTRS        WNT         EN          EF          ZMP         AML         ATPS        MP20
Cape     :59  75     :11  01:01  :12  02:02  :47  01:01  :54  01:01  :82  01:01  : 3  04:04  :28  02:02  : 2  18:18  :42
Peninsula:48  Const  :11  01:02  :21  03:03  :30  01:02  :27  01:02  :10  01:02  : 7  05:05  :18  03:03  :55  05:05  :11
              157    :10  02:02  :74  01:01  :11  02:02  :24  02:02  :15  02:02  :90  06:06  :15  03:06  : 2  11:11  :11
              163    :10              01:03  : 6  NA's   : 2              NA's   : 7  07:08  : 9  05:05  :37  05:06  : 6
              164    :10              01:02  : 4                                      08:08  : 9  05:07  : 1  17:17  : 6
              166    :10              (Other): 6                                      (Other):22  08:08  : 9  (Other):29
              (Other):45              NA's   : 3                                      NA's   : 6  09:09  : 1  NA's   : 2
Then I’d split them by species and look at the genetic data.
sympatry %>%
filter( Species == "Cape") -> cape
summary( cape )
Species       Population  LTRS        WNT         EN          EF          ZMP         AML         ATPS        MP20
Cape     :59  75     :10  01:01  : 2  01:01  : 7  01:01  : 8  01:01  :55  02:02  :54  03:03  : 3  02:02  : 1  17:17  : 6
Peninsula: 0  157    : 8  01:02  : 8  01:02  : 4  01:02  :27  01:02  : 3  NA's   : 5  03:04  : 4  03:03  :55  17:18  : 6
              73     : 8  02:02  :49  02:02  :47  02:02  :24  02:02  : 1              03:05  : 3  03:06  : 2  18:18  :42
              Const  : 8              NA's   : 1                                      04:04  :28  09:09  : 1  18:19  : 4
              ESan   : 6                                                              05:05  :17              19:19  : 1
              Aqu    : 4                                                              NA's   : 4
              (Other):15
sympatry %>%
filter( Species == "Peninsula" ) -> peninsula
summary( peninsula )
Species       Population  LTRS        WNT         EN          EF          ZMP         AML         ATPS        MP20
Cape     : 0  77     : 9  01:01  :10  01:01  : 4  01:01  :46  01:01  :27  01:01  : 3  06:06  :15  02:02  : 1  05:05  :11
Peninsula:48  164    : 8  01:02  :13  01:03  : 6  NA's   : 2  01:02  : 7  01:02  : 7  07:08  : 9  05:05  :37  11:11  :11
              166    : 8  02:02  :25  01:04  : 3              02:02  :14  02:02  :36  08:08  : 9  05:07  : 1  05:06  : 6
              163    : 7              03:03  :30                          NA's   : 2  07:07  : 6  08:08  : 9  06:06  : 5
              Aqu    : 4              03:04  : 1                                      06:07  : 2              10:11  : 5
              Const  : 3              04:04  : 2                                      (Other): 5              (Other): 8
              (Other): 9              NA's   : 2                                      NA's   : 2              NA's   : 2
Now, look at the output. If you look at WNT you can see that the cape group has only alleles 1 and 2, whereas the peninsula group has alleles 1, 3, and 4. In fact, most of the peninsula samples (30 of 48) are 03:03 homozygotes, a genotype that does not occur in the sympatric cape samples. Here are the populations in which these homozygotes exist.
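A listing like this can be produced by tabulating Population for the rows that carry the genotype. The sketch below uses a hypothetical helper, carriers_by_population(), and made-up toy data so it runs on its own. It assumes as.character() on the locus column returns strings such as "03:03".

```r
# Hypothetical helper: number of carriers of one genotype in each population.
carriers_by_population <- function( df, locus, genotype ) {
  has_genotype <- which( as.character( df[[ locus ]] ) == genotype )
  table( as.character( df$Population )[ has_genotype ] )
}

# Toy data standing in for the peninsula subset (made-up values).
toy <- data.frame( Population = c( "77", "77", "164", "Aqu" ),
                   WNT = c( "03:03", "03:03", "01:03", "03:03" ) )
counts <- carriers_by_population( toy, "WNT", "03:03" )
print( counts )
```

For the real data the call would be carriers_by_population( peninsula, "WNT", "03:03" ).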
Repeat with the other loci and you’ll see a lot of evidence that there are private alleles in one species that do not occur in the other.
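To make the comparison concrete, the sketch below splits genotype strings into alleles and takes set differences between the two groups. The helper alleles_in() is hypothetical. The WNT genotypes are transcribed by hand from the cape and peninsula summaries above rather than read from the data.

```r
# Hypothetical helper: the set of alleles present in "a:b" genotype strings.
alleles_in <- function( genotypes ) {
  genotypes <- as.character( genotypes )
  genotypes <- genotypes[ !is.na( genotypes ) & nzchar( genotypes ) ]
  sort( unique( unlist( strsplit( genotypes, ":", fixed = TRUE ) ) ) )
}

# WNT genotypes observed in each sympatric group (from the summaries above).
cape_wnt <- c( "01:01", "01:02", "02:02" )
peninsula_wnt <- c( "01:01", "01:03", "01:04", "03:03", "03:04", "04:04" )

# Private alleles are those found in one group but not the other.
private_cape <- setdiff( alleles_in( cape_wnt ), alleles_in( peninsula_wnt ) )
private_peninsula <- setdiff( alleles_in( peninsula_wnt ), alleles_in( cape_wnt ) )
print( private_cape )
print( private_peninsula )
```

The same two setdiff() calls can be repeated for each of the other loci by passing the corresponding columns of cape and peninsula, assuming as.character() gives colon-separated genotypes for those columns.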