
The American Statistician
ISSN: 0003-1305 (Print) 1537-2731 (Online) Journal homepage: https:///utas20

Assumption Lean Regression

Richard Berk (1, 2), Andreas Buja (2), Lawrence Brown (2), Edward George (2), Arun Kumar Kuchibhotla (2), Weijie Su (2), and Linda Zhao (2)
1 Department of Criminology, University of Pennsylvania
2 Department of Statistics, [...]

To cite this article: Richard Berk, Andreas Buja, Lawrence Brown, Edward George, Arun Kumar Kuchibhotla, Weijie Su & Linda Zhao (2019): Assumption Lean Regression, The American Statistician, DOI: [...].1592781
To link to this article: https:///.1592781
Accepted author version posted online: [...]

It is well known that with observational data, [...] however, [...] inference should be based on sandwich estimators or the pairs (x-y) bootstrap. [...], many of which are effectively untestable (Box, 1976; Leamer, 1978; Rubin, 1986; Cox, 1995; Berk, 2003; Freedman, 2004; 2009).

We discuss here some implications of an “assumption lean” [...] one requires only that the observations are iid, [...] the parameters of fitted models need to be interpreted as statistical functionals, here called “regression functionals.” For ease and clarity of exposition, [...] (2018a; b), a portion of which draws on early insights of Halbert White (1980).

2 The Parent Joint Probability Distribution

For observational data, suppose there is a set of real-valued random variables that have a joint distribution $P$, also called the “population,” that characterizes regressor variables $X_1, \ldots, X_p$ [...]. [...] the regressor variables are not interpreted as fixed; [...] $(p+1) \times 1$ column random vector $X = (1, X_1, \ldots, X_p)'$ [...] $P = P_{Y,X}$ [...] $P_{Y|X}$ for the conditional distribution of $Y$ given $X$, [...]. Hence, [...] the regressors being random variables, [...]

As a feature of $P$ or, more precisely, of $P_{Y|X}$, there is a “true response surface” denoted by $\mu(X)$. [...] $\mu(X)$ is the conditional expectation of $Y$ given $X$, $\mu(X) = E[Y \mid X]$, but there are other possibilities, [...] $\mu(X)$ [...] $P_{Y|X}$ [...] we will make use, for example, of standard ordinary least squares (OLS), but in later sections, [...]; deviations from linearity in $\mu(X)$ may be difficult to detect with diagnostics, or the linear fit is known to be a deficient approximation of $\mu(X)$ and yet, OLS is employed because of substantive theories, measurement scales, [...]

[...] $\beta'X$ to $Y$ with OLS can be represented mathematically at the population $P$ without assuming that the response surface $\mu(X)$ is linear in $X$:

$$\beta(P) = \operatorname*{argmin}_{\beta \in \mathbb{R}^{p+1}} E\left[(Y - \beta'X)^2\right]. \tag{1}$$

The vector $\beta = \beta(P)$ is the “population OLS solution” and contains the “population coefficients.” Notationally, when we write $\beta$, it is understood to be $\beta(P)$. Similar to finite datasets, [...] obtained by solving a population version of the normal equations, resulting in

$$\beta(P) = E[XX']^{-1} E[XY]. \tag{2}$$

Thus, one obtains the best linear approximation to $Y$ as well as to $\mu(X)$. [...] it can be useful without (unrealistically) assuming that $\mu(X)$ is identical to $\beta'X$.

We have worked so far with a distribution/population $P$, [...] therefore, defined a target of estimation: $\beta(P)$ obtained from (1) and (2) [...]-defined as long as the joint distribution $P$ has second moments and the regressor distribution $P_X$ is not perfectly collinear; that is, the second moment matrix $E[XX']$ [...] “assumption lean” or “model robust” [...]?

Indeed, those who insist that models must always be “correctly specified” [...]. “Improving” models by searching regressors, trying out transformations of all variables, inventing new regressors from existing ones, using model selection algorithms, performing interactive experiments, [...] but fit them too well ([...]). Research is under way to provide valid post-selection inference ([...]), [...] however, indicate that extensions of Berk et al. (2013) have asymptotic justifications under misspecification ([...]).

Beyond the costs of data dredging, there can be substantive reasons for discouraging “model improvement.” Some variables may express phenomena in “natural” or “conventional” [...] in Buja et al. (2018b) [...]’s maxim that models are always “wrong” [...] therefore, is a discussion of some of these consequences and an argument in favor of assumption lean inference employing model robust standard errors, such as those obtained from sandwich estimators or the x-y bootstrap.

[...] $\mu(X)$ and $\beta'X$.

[...] shows the true response surface $\mu(x)$ [...] $\beta_0 + \beta_1 x$ [...] $x^*$ [...] linear approximation is denoted as $\delta = y - (\beta_0 + \beta_1 x^*)$ and will be called the “population residual.” The value of $\delta$ at $x^*$ [...] decomposed into two components:

- The first component results from the disparity between the true response surface, $\mu(x^*)$, and the approximation $\beta_0 + \beta_1 x^*$. We denote this disparity by $\eta = \eta(x^*)$ and call it “the nonlinearity.” Because $\beta_0 + \beta_1 x^*$ is an approximation, [...] the nonlinearity $\eta(X)$ is a random variable as well.
- The second component of $\delta$ at $x^*$, denoted by $\varepsilon$, is random variation around the true conditional mean $\mu(x^*)$. We prefer for such variation the term “noise” over “error.” Sometimes it is called “irreducible variation” [...]

[...], in which case we write $\delta = Y - \beta'X$, $\eta = \mu(X) - \beta'X$ and $\varepsilon = Y - \mu(X)$, but these are not assumptions; rather, they are consequences of the definitions that constitute the above OLS-[...]. [...], the nonlinearity and the noise are all “population-orthogonal” to the regressors:

$$E[X_j \delta] = E[X_j \eta(X)] = E[X_j \varepsilon] = 0. \tag{3}$$

As was already noted, these properties (3) [...] $\beta'X$ is the population OLS approximation of $Y$ and also of $\mu(X)$. [...] $(X_0 = 1)$, the facts (3) imply that all three terms are marginally population centered:

$$E[\delta] = E[\eta(X)] = E[\varepsilon] = 0. \tag{4}$$

However, [...] $E[\delta \mid X] = \eta(X)$, which, though marginally centered, is a function of $X$ and hence, not independent of the regressors (unless it vanishes). By comparison, the noise $\varepsilon$ is marginally and conditionally centered, $E[\varepsilon \mid X] = 0$, but not assumed homoskedastic, and hence, [...] “error terms” [...]

[...] common statistical practice that such regressors are treated as fixed (Searle, 1970: Chapter 3). In probabilistic terms, [...] frequentist paradigm, alternative datasets generated from the same model leave regressor values unchanged; [...] regression models have nothing to say about the regressor distribution; they only model [...]
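
The population OLS functional in (2), and the sandwich and pairs (x-y) bootstrap standard errors the text argues for, can be illustrated with a small simulation. The sketch below is not taken from the article: the quadratic response surface, the uniform regressor, the heteroskedastic noise, the HC0 form of the sandwich, and all function names (`ols_fit`, `sandwich_se`, `pairs_bootstrap_se`, `simulate`) are assumptions chosen for illustration. With x uniform on (0, 2) and a true response surface of x squared, working out (2) by hand gives population coefficients of -2/3 (intercept) and 2 (slope), even though the linear model is misspecified.

```python
import numpy as np


def ols_fit(X, y):
    """Sample analogue of (2): solve the normal equations X'X b = X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def classical_se(X, y):
    """Textbook standard errors that assume linearity and homoskedasticity."""
    n, k = X.shape
    resid = y - X @ ols_fit(X, y)
    sigma2 = resid @ resid / (n - k)
    return np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))


def sandwich_se(X, y):
    """HC0 sandwich standard errors: (X'X)^-1 [X' diag(r^2) X] (X'X)^-1."""
    resid = y - X @ ols_fit(X, y)
    bread = np.linalg.inv(X.T @ X)
    meat = (X * (resid ** 2)[:, None]).T @ X
    return np.sqrt(np.diag(bread @ meat @ bread))


def pairs_bootstrap_se(X, y, n_boot=500, seed=1):
    """Pairs (x-y) bootstrap: resample whole rows, refit, take the spread."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # rows drawn with replacement
        draws[b] = ols_fit(X[idx], y[idx])
    return draws.std(axis=0, ddof=1)


def simulate(n, seed=0):
    """Hypothetical population: mu(x) = x^2, noise spread grows with x."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 2.0, size=n)
    noise = 0.5 * x * rng.standard_normal(n)  # E[noise | x] = 0
    y = x ** 2 + noise
    X = np.column_stack([np.ones(n), x])  # leading 1 for the intercept
    return X, y


X, y = simulate(5000)
beta_hat = ols_fit(X, y)
se_classical = classical_se(X, y)
se_sandwich = sandwich_se(X, y)
se_boot = pairs_bootstrap_se(X, y)

print("beta_hat     :", beta_hat)
print("classical SE :", se_classical)
print("sandwich SE  :", se_sandwich)
print("bootstrap SE :", se_boot)
```

In this setup the estimate targets the best linear approximation rather than the curved surface itself. The sandwich and bootstrap standard errors should roughly agree with each other, while the classical ones rest on assumptions the simulated population violates and need not match them.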
