" !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Weight treated as the class attribute. Identifier deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems. Singapore: Springer-Verlag. !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! NAME: fishcatch TYPE: Sample SIZE: 159 observations, 8 variables DESCRIPTIVE ABSTRACT: 159 fishes of 7 species are caught and measured. Altogether there are 8 variables. All the fishes are caught from the same lake (Laengelmavesi) near Tampere in Finland. SOURCES: Brofeldt, Pekka: Bidrag till kaennedom on fiskbestondet i vaera sjoear. Laengelmaevesi. T.H.Jaervi: Finlands Fiskeriet Band 4, Meddelanden utgivna av fiskerifoereningen i Finland. Helsingfors 1917 VARIABLE DESCRIPTIONS: 1 Obs Observation number ranges from 1 to 159 2 Species (Numeric) Code Finnish Swedish English Latin 1 Lahna Braxen Bream Abramis brama 2 Siika Iiden Whitewish Leusiscus idus 3 Saerki Moerten Roach Leuciscus rutilus 4 Parkki Bjoerknan ? Abramis bjrkna 5 Norssi Norssen Smelt Osmerus eperlanus 6 Hauki Jaedda Pike Esox lucius 7 Ahven Abborre Perch Perca fluviatilis 3 Weight Weight of the fish (in grams) 4 Length1 Length from the nose to the beginning of the tail (in cm) 5 Length2 Length from the nose to the notch of the tail (in cm) 6 Length3 Length from the nose to the end of the tail (in cm) 7 Height% Maximal height as % of Length3 8 Width% Maximal width as % of Length3 9 Sex 1 = male 0 = female ___/////___ _ / \\ ___ | /\\ \\_ / / H < ) __) \\ | \\/_\\\\_________/ \\__\\ _ |------- L1 -------| |------- L2 ----------| |------- L3 ------------| Values are aligned and delimited by blanks. Missing values are denoted with NA. There is one data line for each case. 
SPECIAL NOTES:
I have usually calculated
    Height = Height%*Length3/100
    Width  = Width%*Length3/100

PEDAGOGICAL NOTES:
I have mainly used only Species=7 (Perch). Here are some of the models
and tests we have used:

    Weight=a+b*(Length3*Height*Width)+epsilon
    Ho: a=0; Heteroscedastic case.
    Question: What is the proper weighting if you use Length3 as a
    weighting variable?

    Log(Weight)=a+b1*Length3+epsilon

    Weight^(1/3)=a+b1*Length3+epsilon
    (Given by Box-Cox-transformation) Ho: a=0;

    Log(Weight)=a+b1*Length3+b2*Height+b3*Width+epsilon
    Ho: b1+b2+b3=3; i.e. dimension of the fish = 3

    Weight^(1/3)=a+b1*Length3+b2*Height+b3*Width+epsilon
    (Given by Box-Cox-transformation) Ho: a=0;

    Weight=a*Length3^b1*Height^b2*Width^b3+epsilon
    Nonlinear, heteroscedastic case. What is the proper weighting?

Is obs 143

    143 7 840.0 32.5 35.0 37.3 30.8 20.9 0

an outlier? It had 6 roach in its stomach.

REFERENCES:
Brofeldt, Pekka: Bidrag till kaennedom on fiskbestondet i vaara
sjoear. Laengelmaevesi. T.H.Jaervi: Finlands Fiskeriet Band 4,
Meddelanden utgivna av fiskerifoereningen i Finland.
Helsingfors 1917 SUBMITTED BY: Juha Puranen Departement of statistics PL33 (Aleksanterinkatu 7) 000014 University of Helsinki Finland e-mail: jpuranen@noppa.helsinki.fi " "0" "'fishcatch'" 23.2 24 23.9 26.3 26.5 26.8 26.8 27.6 27.6 28.5 28.4 28.7 29.1 29.4 29.4 30.4 30.4 30.9 31 31.3 31.4 31.5 31.8 31.9 31.8 32 32.7 32.8 33.5 35 35 36.2 37.4 38 23.6 24.1 25.6 28.5 33.7 37.3 12.9 16.5 17.5 18.2 18.6 19 19.1 19.4 20.4 20.5 20.5 21 21.1 22 22 22.1 23.6 24 25 29.5 13.5 14.3 16.3 17.5 18.4 19 19 19.8 21.2 23 24 9.3 10 10.1 10.4 10.7 10.8 11.3 11.3 11.4 11.5 11.7 12.1 13.2 13.8 30 31.7 32.7 34.8 35.5 36 40 40 40.1 42 43.2 44.8 48.3 52 56 56 59 7.5 12.5 13.8 15 15.7 16.2 16.8 17.2 17.8 18.2 19 19 19 19.3 20 20 20 20 20 20.5 20.5 20.7 21 21.5 22 22 22.6 23 23.5 25 25.2 25.4 25.4 25.4 25.9 26.9 27.8 30.5 32 32.5 34 34 34.5 34.6 36.5 36.5 36.6 36.9 37 37 37.1 39 39.8 40.1 40.2 41.1 30 31.2 31.1 33.5 34 34.7 34.5 35 35.1 36.2 36.2 36.2 36.4 37.2 37.2 38.3 38.5 38.6 38.7 39.5 39.2 39.7 40.6 40.5 40.9 40.6 41.5 41.6 42.6 44.1 44 45.3 45.9 46.5 28.7 29.3 30.8 34 39.6 43.5 16.2 20.3 21.2 22.2 22.2 22.8 23.1 23.7 24.7 24.3 25.3 25 25 27.2 26.7 26.8 27.9 29.2 30.6 35 16.5 17.4 19.8 21.3 22.4 23.2 23.2 24.1 25.8 28 29 10.8 11.6 11.6 12 12.4 12.6 13.1 13.1 13.2 13.4 13.5 13.8 15.2 16.2 34.8 37.8 38.8 39.8 40.5 41 45.5 45.5 45.8 48 48.7 51.2 55.1 59.7 64 64 68 8.8 14.7 16 17.2 18.5 19.2 19.4 20.2 20.8 21 22.5 22.5 22.5 22.8 23.5 23.5 23.5 23.5 23.5 24 24 24.2 24.5 25 25.5 25.5 26.2 26.5 27 28 28.7 28.9 28.9 28.9 29.4 30.1 31.6 34 36.5 37.3 39 38.3 39.4 39.3 41.4 41.4 41.3 42.3 42.5 42.4 42.5 44.6 45.2 45.5 46 46.6 38.4 40 39.8 38 36.6 39.2 41.1 36.2 39.9 39.3 39.4 39.7 37.8 40.2 41.5 38.8 38.8 40.5 37.4 38.3 40.8 39.1 38.1 40.1 40 40.3 39.8 40.6 44.5 40.9 41.1 41.4 40.6 37.9 29.2 27.8 28.5 31.6 29.7 28.4 25.6 26.1 26.3 25.3 28 28.4 26.7 25.8 23.5 27.3 27.8 26.2 25.6 27.7 25.9 27.6 25.4 30.4 28 27.1 41.5 37.8 37.4 39.4 39.7 36.8 40.5 40.4 40.1 39.6 39.2 16.1 17 14.9 18.3 16.8 15.7 16.9 
16.9 16.7 15.6 18 16.5 18.9 18.1 16 15.1 15.3 15.8 18 15.6 16 15 17 14.5 16 15 16.2 17.9 15 15 15.9 24 24 23.9 26.7 24.8 27.2 26.8 27.9 24.7 24.2 25.3 26.3 25.3 28 26 24 26 25 23.5 24.4 28.3 24.6 21.3 25.1 28.6 25 25.7 24.3 24.3 25.6 29 24.8 24.4 25.2 26.6 25.2 24.1 29.5 28.1 30.8 27.9 27.7 27.5 26.9 26.9 26.9 30.1 28.2 27.6 29.2 26.2 28.7 26.4 27.5 27.4 26.8 13.4 13.8 15.1 13.3 15.1 14.2 15.3 13.4 13.8 13.7 14.1 13.3 12 13.9 15 13.8 13.5 13.3 14.8 14.1 13.7 13.3 15.1 13.8 14.8 15 14.1 14.9 15.5 14.3 14.3 14.9 14.7 13.7 14.8 14.5 15.2 19.3 16.6 15 14 13.9 13.7 14.3 16.1 14.7 14.7 13.9 15.2 14.6 15.1 13.3 15.2 14.1 13.6 15.4 14 15.4 15.6 15.3 14.1 13.3 13.5 13.7 14.7 14.2 14.7 13.1 14.2 14.8 14.6 9.7 10 9.9 11.5 10.3 10.2 9.8 8.9 8.7 10.4 9.4 9.1 13.6 11.6 9.7 11 11.3 10.1 11.3 9.7 9.5 9.8 11.2 10.2 10 10.5 11.2 11.7 9.6 9.6 11 16 13.6 15.2 15.3 15.9 17.3 16.1 15.1 14.6 13.2 15.8 14.7 16.3 15.5 14.5 15 15 15 17 15.1 15.1 15 14.8 14.9 14.6 15 15.9 13.9 15.7 14.8 17.9 15 15 15.8 14.3 15.4 15.1 17.7 17.5 20.9 17.6 17.6 15.9 16.2 18.1 14.5 17.8 16.8 17 17.6 15.6 15.4 16.1 16.3 17.7 16.3 nan nan nan nan nan nan nan nan nan nan nan nan nan 1 nan 1 nan nan nan 1 nan nan nan nan 1 nan nan nan 0 0 nan 1 0 nan nan nan nan nan 0 nan nan nan nan nan nan nan 0 0 0 0 0 nan 0 nan nan 0 nan nan 0 nan nan 1 1 1 nan nan 0 0 nan 0 0 1 0 1 0 1 1 1 0 0 0 0 0 0 0 nan 0 nan nan nan 1 nan nan nan nan 0 0 nan nan nan 0 0 nan nan nan nan nan nan nan nan nan nan nan nan 1 0 0 nan nan nan 0 0 0 nan nan nan nan nan nan 0 nan nan 0 0 nan 0 nan 0 0 nan nan 0 0 0 0 0 0 nan nan 0 0 0 0 0 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 25 26 26 29 29 29 29 30 30 30 31 31 31 32 32 33 33 33 33 34 34 34 35 35 35 35 36 36 37 38 38 
39 41 41 26 26 28 31 36 40 14 18 18 19 20 20 20 21 22 22 22 22 22 24 23 23 25 26 27 31 14 15 17 19 20 20 20 21 23 25 26 9 10 10 11 11 11 11 11 12 12 12 13 14 15 32 34 35 37 38 38 42 42 43 45 46 48 51 56 60 60 63 8 13 15 16 17 18 18 19 19 20 21 21 21 21 22 22 22 22 22 22 22 22 23 23 24 24 24 25 25 26 27 27 27 27 28 28 30 32 34 35 36 36 37 37 39 39 39 40 40 40 40 42 43 43 43 44 242 290 340 363 430 450 500 390 450 500 475 500 500 600 600 700 700 610 650 575 685 620 680 700 725 720 714 850 1000 920 955 925 975 950 270 270 306 540 800 1000 40 69 78 87 120 0 110 120 150 145 160 140 160 169 161 200 180 290 272 390 55 60 90 120 150 140 170 145 200 273 300 6 7 7 9 9 8 10 9 9 12 13 12 19 19 200 300 300 300 430 345 456 510 540 500 567 770 950 1250 1600 1550 1650 5 32 40 51 70 100 78 80 85 85 110 115 125 130 120 120 130 135 110 130 150 145 150 170 225 145 188 180 197 218 300 260 265 250 250 300 320 514 556 840 685 700 700 690 900 650 820 850 900 1015 820 1100 1000 1100 1000 1000 "Species" "Length1" "Length2" "Length3" "Height" "Width" "Sex" "class" "int0" "double1" "int2" "double3" "int4" "nominal:1,2,3,4,5,6,7" "numeric" "numeric" "numeric" "numeric" "numeric" "nominal:1,0" "numeric"