REFERENTIAL PROPERTIES OF ENGLISH DETACHED NONFINITE CONSTRUCTIONS WITH AN EXPLICIT SUBJECT: OPERATIONALIZATION AND QUANTIFICATION

This article presents the results of a corpus-quantitative parameterization of the referential properties of English detached nonfinite constructions with an explicit subject, carried out from the perspective of the cognitive-quantitative approach to language study. Through the prism of cognitive-constructive grammar, the syntactic patterns under scrutiny are recognized as grammatical constructions, i.e. complex semiotic units, non-compositional cognitively motivated pairings of form and conceptual meaning/function, stored as holistic, conceptually connected, and interacting structures. Corpus-quantitative parameterization of the referential properties of the given constructions presupposes the analysis of the linguistic means of expressing coreference between five micro-constructions and a corresponding matrix clause, reflected by the factors “Coreference” (COREF) and “Absence of coreference” (ØCOREF) of the parameter “Reference relations” (REFREL). Quantitative verification of the data involves a three-stage procedure incorporating 1) a multivariate analysis of variance (MANOVA), 2) a one-way analysis of variance (ANOVA), and 3) Tukey’s multiple comparison test, performed with the statistical software R. The obtained results show that the non-augmented constructions are more strongly integrated semantically into a matrix clause, compensating for the lack of syntactic connection by closer reference relations, manifested by explicitly expressed full or partial coreference. The use of augmentors facilitates the identification of with-, without-, despite- and what_with-augmented constructions as syntactic patterns, thus balancing the absence of coreference and ensuring adequate cognitive processing of constructions.


Introduction
The study of complex syntactic patterns and their components is among the most topical issues of contemporary grammar, given the significant changes taking place in linguistics under the influence of recent theoretical frameworks and a tendency to master novel tools and methods of analysis. Linguistic studies are increasingly shaped by the so-called "quantitative turn" (Janda, 2013: 2), which results in a paradigmatic shift towards an empirical approach to language analysis. The research tools of state-of-the-art grammar are enriched by usage-based methodology, reference to the data of linguistic corpora, active implementation of quantitative methods, and specialized statistical software.
English detached nonfinite clauses with an explicit subject are specific syntactic patterns characterized by idiosyncratic morphosyntactic, semantic, and functional-pragmatic features.
These syntactic patterns can be illustrated by contexts (1-5) drawn from the BYU-BNC corpus.
The patterns under scrutiny represent a nonfinite secondary predication of a syntactically independent configuration. They are part of a minimally biclausal syntactic structure consisting of a matrix clause and a punctuationally or intonationally separated nonfinite clause with its own overt subject. The clauses are of a fixed binary structure [NP XP], where NP represents a secondary subject (Subj), different from the subject of the matrix clause (SBJ_M), and XP is a secondary predicate (Pred), expressed by a nonfinite verb form (NF) (participle I (PI), participle II (PII), infinitive (to-Inf)) or a non-verbal part of speech (VL) (noun phrase (NP), adjective phrase (AdjP), adverbial phrase (AdvP) or prepositional phrase (PP)), and connected with a matrix clause through augmentors (aug) (with, without, despite, what with) or asyndetically (øaug). In a sentence, the patterns perform the general syntactic function of an adverbial modifier elaborating, extending, or enhancing the matrix proposition.
Theoretical and methodological background for corpus-quantitative parameterization of a grammatical construction
The anthropocentric vector of the contemporary linguistic paradigm opens new vistas in the study of language structures as "emergent clusters of lossy memory traces that are aligned within our high-(hyper!) dimensional conceptual space on the basis of shared form, function and contextual dimensions" (Goldberg, 2019: 7) or form-meaning pairs, collectively referred to as constructions (Fillmore, 1988; Goldberg, 1995, 2006; Croft, 2008; Hilpert, 2021).
The notion of construction has been reintroduced into linguistics by a recent cognitively oriented grammatical theory, construction grammar. The constructional approach to grammar has reinterpreted the conventional linguistic term "construction", giving it a new extended understanding: a construction is recognized as the fundamental unit of language analysis and representation and postulated as a linguistic sign, a pairing of form (the plane of expression) and meaning (the plane of content) (Hoffmann, 2016; Östman, Fried, 2004). As non-compositional, (completely) productive, cognitively entrenched (automated) and complex pairings, constructions are "models for the representation of all grammatical knowledge - syntax, morphology, and lexicon" (Croft, 2008: 463), stored in the construct-i-con, a structured inventory of taxonomic structural networks (Goldberg, Croft, Cruse, 2004; Hoffmann, 2017), and serve as a cognitive-semantic interface to the structures of knowledge (cognitive structures) behind the plane of expression of constructions.
In light of the construction grammar framework, English detached nonfinite clauses with an explicit subject are viewed as grammatical constructions, since they instantiate sufficiently frequent, cognitively motivated pairings of form (organization of constituents) and conceptual meaning/function stored as holistic, conceptually connected, and interacting structures. As a clausal type of construction, the patterns elaborate meaning through discourse functions rather than coded semantics. The detached nonfinite with explicit subject (DNFES) constructions constitute a taxonomic constructional network, with every node representing an individual type of construction. The network is organized around a constructional schema, represented by the construction of the highest level of schematicity and abstractness, the macro-construction (dtcht-SubjPred NF -cxn). The properties of the macro-construction are inherited by the lower-level meso-constructions (dtcht-øaug-Subj Pred nf/vl -cxn, dtcht-aug-Subj Pred nf/vl -cxn {AUG: with, what with, without, despite}), further acquired by individual micro-constructions (dtcht-øaug-Subj Pred nf/vl -cxn, dtcht-with-Subj Pred nf/vl -cxn, dtcht-despite-Subj Pred nf/vl -cxn, dtcht-without-Subj Pred nf/vl -cxn, dtcht-what_with-Subj Pred nf/vl -cxn {NF: PI, PII, to-Inf; VL: NP, AdjP, AdvP, PP}) and instantiated in specific realized constructions, i.e., constructs ([his cheeks burning suddenly], [with thick spectacles perched at the end of his nose], [hands in pockets]…).
One of the stages in the study of a grammatical construction presupposes corpus-quantitative parameterization of its linguistic properties and involves the development of an appropriate formal model that reflects all possible linguistic parameters of a grammatical construction, collectively forming its linguistic profile, i.e. an organized set of linguistic properties (parameters) of a grammatical construction (morphosyntactic, positional, relational, referential, distributional, functional, collostructional, etc.) presented in a quantitative dimension. Corpus-quantitative parameterization of the referential properties of the DNFES-constructions entails the analysis of the linguistic means of expressing coreference relations between the micro-constructions (dtcht-øaug-Subj Pred nf/vl -cxn, dtcht-with-Subj Pred nf/vl -cxn, dtcht-despite-Subj Pred nf/vl -cxn, dtcht-without-Subj Pred nf/vl -cxn, dtcht-what_with-Subj Pred nf/vl -cxn) and a matrix clause, reflected by the factors "Coreference" (COREF) and "Absence of coreference" (ØCOREF) of the parameter "Reference relations" (REFREL). The data for the micro-constructions are generalized for the corresponding meso-constructions (dtcht-øaug-Subj Pred nf/vl -cxn, dtcht-aug-Subj Pred nf/vl -cxn), which in turn become the basis for conclusions about the macro-construction (dtcht-Subj Pred NF -cxn).
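The generalization from micro- to meso- to macro-construction described above can be pictured as a simple roll-up of token counts along the taxonomic network. The following is a minimal sketch under stated assumptions: the short labels (`øaug`, `with`, etc.), the `MESO_OF` mapping, the `roll_up` helper and the toy counts are hypothetical names and values introduced here for illustration; they are not the study's data or its actual procedure.

```python
from collections import Counter

# Hypothetical short labels for the five micro-constructions, each mapped to
# the meso-construction it inherits from (non-augmented vs. augmented).
MESO_OF = {
    "øaug": "øaug",
    "with": "aug",
    "without": "aug",
    "despite": "aug",
    "what_with": "aug",
}


def roll_up(micro_counts):
    """Sum micro-construction token counts into meso totals and a macro total."""
    meso = Counter()
    for micro, n_tokens in micro_counts.items():
        meso[MESO_OF[micro]] += n_tokens
    macro_total = sum(meso.values())
    return dict(meso), macro_total


# Toy token counts for one factor level; NOT figures from the study.
toy_counts = {"øaug": 40, "with": 30, "without": 15, "despite": 10, "what_with": 5}
meso_totals, macro_total = roll_up(toy_counts)
print(meso_totals, macro_total)
```

The same roll-up would be applied once per factor level, so that each node of the network receives its own quantitative profile.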
The frequency of constructions indicates the degree of their entrenchment in a language community and correlates with the number of tokens (constructs) within a corresponding parameter/factor. Quantitative verification of the data and identification of statistically significant indicators are carried out by means of a three-stage quantitative procedure: 1) a multivariate analysis of variance (MANOVA), 2) a one-way analysis of variance (ANOVA) and 3) Tukey's multiple comparison test, performed with the statistical software R (R Core Team, 2017).
ANOVA, or analysis of variance, is a parametric statistical procedure for comparing multiple samples on a metric scale. The ANOVA test aims to find dependencies in experimental data by establishing the significance of differences in mean values. In our study, a one-way ANOVA is utilized to assess the influence of a particular factor level on the given micro-construction. ANOVA calculates the F statistic, which reflects the ratio of the variance caused by the factor to the "random" variance. From this statistic, the significance level p is calculated, on the basis of which a conclusion is made about the homogeneity of the analyzed sample, that is, about the absence or presence of an influence of the factor. MANOVA, or multivariate analysis of variance, checks for group differences across several dependent variables. In the case of statistically significant differences (p < 0.05), a posteriori multiple comparisons by Tukey's test (Brezina, 2018) are performed to determine which pairs of micro-constructions differ within the analyzed factor level. The obtained indicators make it possible to statistically substantiate the linguistic features that determine the functional dynamics and synchronic variability of the constructional network of the English detached nonfinite constructions under study and of its individual nodes. The study is based on 11,000 constructs selected from a corpus of modern English, the British National Corpus (BYU-BNC).
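To make the F ratio concrete, the sketch below computes a one-way ANOVA F statistic by hand as the between-group mean square divided by the within-group mean square. It is an illustration only: the study itself used R, the function name `one_way_f` is hypothetical, and the toy groups are invented numbers, not corpus measurements. The p-value would then be read off the F distribution with the returned degrees of freedom, a step omitted here.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on lists of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation explained by the factor: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # "Random" variation: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data: three groups of three observations each (hypothetical values).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_f(toy_groups)
print(round(f_stat, 4), df_b, df_w)
```

A large F means that the group means differ by more than the within-group scatter would suggest; whether that difference is significant depends on comparing F against the F distribution with (df_between, df_within) degrees of freedom.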
Operationalization and quantification of coreference parameter/factors
By default, the DNFES-constructions are ascribed an autonomous status, manifested by the absence of full integration into the syntactic structure of the matrix clause (Quirk et al., 1985: 1120). These constructions are detached members of a sentence and do not depend on other sentence members (Tronskiy, 2001: 346); there is no reference correlation, or coreference (Kortmann, 1991), between the denotations of the DNFES-constructions and the denotations of the constituents of the matrix clause (Visser, 1972: 1259). The syntactic patterns are independent of the subject of the main sentence, grammatically unrelated to the matrix sentence, and typically there is no visible connection with the matrix clause (Martinčič, 2014: 22).
However, B. Kortmann emphasizes that in most cases there is a certain reference correlation between the denotations of these structures and the denotations of the constituents of the matrix clause, or even the constituents of the surrounding linguistic context (Kortmann, 1991: 91). B. Combettes is convinced that in such syntactic patterns at least one of the constituents is connected with the subject or object in the matrix clause by "part-whole" relations (Combettes, 2005). It is the formal or logical non-identity of the subject of [[øaug/aug] [Subj NP][Pred NF]]-constructions and the subject of the matrix clause that is a necessary condition for distinguishing these syntactic patterns as a separate class of syntactic structures.
Depending on the degree of coreference between the subject of the DNFES-constructions and the constituent of the matrix clause (Kortmann, 1991: 92), three types of reference relations are distinguished: 1) absence of coreference; 2) partial coreference; 3) full coreference.
The absence of coreference is recognized when the subject of the DNFES-constructions is not related to any constituent of the matrix clause, to the entire matrix proposition, or to the immediate context. The logical difference from the matrix subject requires the use of lexical units that do not have an anaphoric function. In such cases, the subject of DNFES-constructions is most often expressed by a common noun or a proper name (names and surnames, names of cities, companies, sports teams, etc.) or by the expletive pronouns it and this, with the whole construction acquiring an idiomatic character (all being well, weather permitting, permission granted, other things being equal) (6-8).
Partial coreference is observed when the subject of the DNFES-constructions is partially coreferential with the subject of the matrix clause or with some other matrix constituent. Typically, partial coreference implies meronymic relations of two types: the "component - the whole object" ("part of the whole") relations and the "member - assembly/set" ("member of the whole") relations. These relations are similar in that both denote a "whole" consisting of "parts", but the "whole" and the "part" differ in each case. In the "part of the whole" relations, the "whole" refers to the body of a human being/animal, and in the "member of the whole" relations the "whole" is a group, team, company, army, etc. On the one hand, in the former type of relations the parts are inseparable from the whole, whereas in the latter no such inseparability holds. On the other hand, being individual "objects", the entities in the "member of the whole" relations are always discrete, while the entities in the "part of the whole" relations are inherent parts of the "object" (12-13).
Partial coreference of the "component - the whole object" type is observed when the subject of the DNFES-constructions is in relations of inalienable possession or pertinence (Fabricius-Hansen, Haug, 2012) with a referent of the matrix subject. Following (Chappell, McGregor, 1996: 4), the relations of inalienable possession are typical of objects closely connected with a person (or a thing), for example, 1) inherently associated objects, such as spatial relations ('front', 'top', 'side'); 2) objects that are an integral part of a person (thing) (e.g., body parts); 3) individuals with a biological or social connection between them, for example, family ties; 4) material objects that are in the inseparable possession of a person.
"Inalienable possession" relations are expressed by possessive or reflexive anaphora, controlled by a matrix subject and/or nouns denoting parts of a structured whole, inalienable things, etc. For example, a part of a human or animal body is naturally "connected" with the body as a holistic, discrete unit of reality (14-15).
The "member - assembly/set" relations are displayed when the antecedent of the subject of the DNFES-constructions in the matrix clause denotes several referents (a plural entity, a group of individuals), and the subject of the construction is expressed by an inclusive pronoun (all, every, each, either, etc.). The pronominal subject in the DNFES-constructions refers to these referents as a whole (both) or to each member individually (each). If the construction precedes the matrix clause, then the subject of the matrix clause is expressed by the pronoun each.
Thus, based on the results of the operationalization, the parameter "Reference relations" (REFREL) for the DNFES-constructions is represented by the factors "Coreference" (COREF) and "Absence of coreference" (ØCOREF). The factor "Coreference" is manifested on the levels of "Full coreference" (COREFFull) and "Partial coreference" (COREFPart). The level of the "Absence of coreference" factor (ØCOREF) coincides with the factor itself. The quantitative analysis of the implementation of the factors COREF and ØCOREF has shown significant differences in coreference relations between types of constructions and their matrix clauses. The results are given in Table 1 (the total number of constructs in the sample, N = 11,000, is taken as 100%).
As can be seen from Table 1, the non-augmented micro-construction dtcht-øaug-SubjPred nf/vl -cxn is recorded in the largest number of contexts where it exhibits coreferential relations with the matrix clause. This is evidenced by a higher number of constructs with full and partial coreferential links in comparison with contexts where coreference is absent. In addition, this micro-construction shows the highest rate of full coreference links with the matrix clause in the investigated sample.
A similar situation is observed in the linguistic profiles of the meso-constructions: the non-augmented dtcht-øaug-SubjPred nf/vl -cxn shows a greater degree of coreference with the matrix clause than the augmented dtcht-aug-SubjPred nf/vl -cxn meso-construction.
In the linguistic profile of the macro-construction dtcht-SubjPred NF -cxn we record a clear quantitative predominance of non-coreference relations (62.54% of the total sample size) over full and partial coreference (2.11% and 35.35%, respectively).
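The percentages reported for the factor levels are shares of the total sample, with the levels COREFFull and COREFPart together making up the factor COREF. A minimal sketch of that bookkeeping follows; the `REFREL` dictionary mirrors the factor/level structure described in the text, while the helper names and the toy counts are hypothetical and do not reproduce Table 1.

```python
# Parameter REFREL: factor -> levels, following the operationalization in the text.
REFREL = {
    "COREF": ["COREFFull", "COREFPart"],
    "ØCOREF": ["ØCOREF"],  # the single level coincides with the factor itself
}


def level_shares(level_counts):
    """Express each level's token count as a percentage of the whole sample."""
    total = sum(level_counts.values())
    return {level: round(100 * n / total, 2) for level, n in level_counts.items()}


def factor_shares(level_counts):
    """Sum the level percentages belonging to each factor."""
    shares = level_shares(level_counts)
    return {
        factor: round(sum(shares[level] for level in levels), 2)
        for factor, levels in REFREL.items()
    }


# Toy counts for illustration only; NOT the study's figures.
toy_counts = {"COREFFull": 5, "COREFPart": 70, "ØCOREF": 125}
print(level_shares(toy_counts))
print(factor_shares(toy_counts))
```

Because every construct falls under exactly one level, the level shares sum to 100%, which is also true of the three macro-construction percentages quoted above (62.54 + 2.11 + 35.35).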
At the next stage of the research, a one-way analysis of variance (ANOVA) is employed to check whether the established quantitative differences in the realization of referential relations are statistically significant for distinguishing the micro-constructions from each other. The results obtained do not demonstrate statistically significant differences (Pr(>F) = 0.2090 > 0.05) between the micro-constructions (dtcht-øaug-SubjPred nf/vl -cxn, dtcht-with-SubjPred nf/vl -cxn, dtcht-despite-SubjPred nf/vl -cxn, dtcht-without-SubjPred nf/vl -cxn, dtcht-what_with-SubjPred nf/vl -cxn) in the realization of full coreference (COREFFull) with the matrix clause. Both augmented and non-augmented micro-constructions instantiate the smallest number of such relations in the sample and are considered statistically homogeneous in this respect.
The analysis of the implementation of the factor "Coreference" (COREF) at the level of "Partial coreference" by Tukey's multiple comparison method reveals that the manifestation of semantic relations of partial coreference with the matrix clause is a statistically significant indicator distinguishing the linguistic profiles of six pairs of constructions: dtcht-with-SubjPred nf/vl -cxn vs dtcht-despite-SubjPred nf/vl -cxn; dtcht-øaug-SubjPred nf/vl -cxn vs dtcht-despite-SubjPred nf/vl -cxn; dtcht-with-SubjPred nf/vl -cxn vs dtcht-what_with-SubjPred nf/vl -cxn; dtcht-øaug-SubjPred nf/vl -cxn vs dtcht-what_with-SubjPred nf/vl -cxn; dtcht-without-SubjPred nf/vl -cxn vs dtcht-with-SubjPred nf/vl -cxn; and dtcht-without-SubjPred nf/vl -cxn vs dtcht-øaug-SubjPred nf/vl -cxn. The obtained indicators show that the with-augmented and the non-augmented dtcht-øaug-SubjPred nf/vl -cxn micro-constructions show the same tendency to be used in contexts where they realize partially coreferential connections with a matrix clause, but differ in this indicator from the other constructions. Likewise, the without-, despite- and what_with-augmented micro-constructions are not differentiated from one another by this factor, i.e. they demonstrate the same potential to actualize partial coreference relations with a matrix clause.
"Absence of coreference" (ØCOREF) is a distinctive feature for the micro-constructions dtcht-with-SubjPred nf/vl -cxn and dtcht-despite-SubjPred nf/vl -cxn; dtcht-with-SubjPred nf/vl -cxn and dtcht-what_with-SubjPred nf/vl -cxn; dtcht-without-SubjPred nf/vl -cxn and dtcht-with-SubjPred nf/vl -cxn, indicating that the with-augmented construction differs from the other augmented micro-constructions by the predominant implementation of non-coreferential links with a matrix clause. The other micro-constructions are homogeneous in this respect, i.e. the quantitative differences between them are not statistically significant.
The pair of micro-constructions dtcht-øaug-SubjPred nf/vl -cxn and dtcht-with-SubjPred nf/vl -cxn deserves special attention: at the 95% confidence level no statistically significant difference is recorded for it, but the obtained p-value of 0.0754513 only slightly exceeds the critical value of 0.05. Therefore, the absence of coreference between these micro-constructions and their matrix clauses can be recognized as a factor that distinguishes these two patterns, albeit at a lower level of confidence (the difference would count as significant at the 90% level, since p < 0.10). The diff index is negative, which indicates the predominance of constructs with absent coreference in the linguistic profile of the dtcht-with-SubjPred nf/vl -cxn micro-construction over dtcht-øaug-SubjPred nf/vl -cxn.
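The pairwise results for partial coreference can be checked for internal consistency: five micro-constructions yield ten possible pairs, and the six pairs reported as significantly different should be exactly those that cross the two homogeneous groups, {øaug, with} and {without, despite, what_with}. The sketch below encodes this reading. The short labels and helper names are hypothetical, and the sixth pair (without vs øaug) is an assumption based on the most plausible reading of the source's list. The final lines apply the significance thresholds to the p-value quoted above for the øaug vs with pair on ØCOREF.

```python
from itertools import combinations

# Hypothetical short labels for the five micro-constructions.
MICRO = ["øaug", "with", "without", "despite", "what_with"]

# Pairs reported as significantly different for COREFPart
# (the last pair is an assumed reading of the source's list).
SIGNIFICANT_COREFPART = {
    frozenset(pair)
    for pair in [
        ("with", "despite"),
        ("øaug", "despite"),
        ("with", "what_with"),
        ("øaug", "what_with"),
        ("without", "with"),
        ("without", "øaug"),
    ]
}

all_pairs = [frozenset(pair) for pair in combinations(MICRO, 2)]
homogeneous_pairs = [p for p in all_pairs if p not in SIGNIFICANT_COREFPART]

# The two groups described in the text as internally homogeneous.
group_a = {"øaug", "with"}
group_b = {"without", "despite", "what_with"}
cross_pairs = {frozenset((a, b)) for a in group_a for b in group_b}


def significant_at(p_value, alpha):
    """Reject the null hypothesis of no difference when p is below alpha."""
    return p_value < alpha


p_oaug_vs_with = 0.0754513  # ØCOREF, value quoted in the text
print(len(all_pairs), len(homogeneous_pairs), cross_pairs == SIGNIFICANT_COREFPART)
print(significant_at(p_oaug_vs_with, 0.05), significant_at(p_oaug_vs_with, 0.10))
```

Under this reading, the four remaining pairs are the within-group ones (øaug vs with, and the three pairs among without, despite and what_with), which matches the statement that these constructions are not differentiated by partial coreference.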

Conclusions
The results of the conducted corpus-quantitative parameterization challenge the traditional opinion of grammarians about the completely autonomous status of the English DNFES-constructions and the absence of reference relations between them and the corresponding matrix clauses. The obtained data show that cases of coreference (full and partial) between a constituent of the matrix clause and the subject of a DNFES-construction are relatively frequent, but are mainly registered in the linguistic profile of the dtcht-øaug-SubjPred nf/vl -cxn micro-construction. The non-augmented micro-construction is more strongly integrated semantically into the matrix clause, compensating for the lack of syntactic connection with the matrix clause by closer reference relations with it, which are manifested in the form of explicitly expressed full or partial coreference. Coreference relations (along with the corresponding determinant of the subject of the construction) provide adequate cognitive processing of this syntactic pattern. The use of augmentors, the list of which is limited in modern English (with, without, despite, what with), facilitates the identification of a DNFES-construction as a syntactic structure, thus balancing the lack of coreference. The application of the three-stage corpus-quantitative procedure statistically verifies the determinant coreference factors for each micro-construction and provides a comprehensive linguistic-quantitative characterization of the factors/factor levels that determine a speaker's choice of a particular DNFES-construction.
The findings presented in this paper point to the need for further research. Additional studies of the investigated syntactic patterns that combine the constructional approach with the methods and tools of quantitative corpus linguistics will be of considerable interest. The next stage of our research will be to validate the suggested computerized linguo-quantitative procedure by investigating other linguistic parameters (positional, distributional, collostructional, etc.) of the grammatical constructions under study and to statistically verify the determining parameters (factors) conditioning the functional dynamics and variability of the network of detached nonfinite constructions with an explicit subject in present-day English.