
Internet links

http://www.nltk.org

http://python.org

APPENDIX A

Language Log

December 29, 2005

Asbestos she can

A few days ago, Nathan Bierma asked me (by email) whether the construction exemplified by "as best (as) I can" might be a blend of "the best (that) I can" and "as well as I can". The puzzle is why we say "as best (as) I can", but not "as hardest (as) I can", or indeed "as ___ (as) I can" for any other superlative.

Whatever the exact history, "as best <SUBJ> <MODAL>" is an old pattern. For instance, an anonymous drama from 1634, "The Mirror of New Reformation", has the lines

... I wil straight dispose, as best I can, th'inferiour Magistrate ...

And in "The Taming of the Shrew" (1594), Shakespeare has Petruchio say

And I haue thrust my selfe into this maze, Happily to wiue and thriue, as best I may ...

The pattern "as best as" seems to be more recent. The earliest citation I could find was from 1856, in "Night and Morning" (a play adapted from the novel by Bulwer-Lytton), where Gawtry says:

In fine, my life is that of a great schoolboy, getting into scrapes for the fun of it, and fighting my way out as best as I can!

It continues to be used by reputable authors, as in William Carlos Williams' 1917 poem "Sympathetic Portrait of a Child":

As best as she can she hides herself in the full sunlight

But whatever the origins and history of the construction, Nathan's suggestion might have something to do with the forces that keep it in current use. So I thought I'd look at some current web counts; and since different search engines sometimes give counts that differ in random-seeming ways, I tried MSN, Yahoo and Google. I started by looking at the patterns "as best __ can" and "as best as __ can", across the different pronouns. I might still discover something relevant to Nathan's question, but along the way I stumbled on a strange pattern in the web search counts, which I'll share with you now.
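As a rough illustration of that first step, the exact-phrase queries for both patterns can be generated rather than typed by hand. This is only a sketch: the names `PRONOUNS`, `PATTERNS` and `build_queries` are hypothetical, and the double-quote phrase syntax is an assumption about how the search engines were queried.

```python
# Pronouns and patterns as listed in the text; the helper name is hypothetical.
PRONOUNS = ["I", "you", "he", "she", "it", "we", "they"]
PATTERNS = ["as best {} can", "as best as {} can"]


def build_queries(patterns, pronouns):
    """Return, for each pattern, one double-quoted phrase query per pronoun."""
    return {
        pattern: ['"' + pattern.format(word) + '"' for word in pronouns]
        for pattern in patterns
    }


queries = build_queries(PATTERNS, PRONOUNS)
for pattern, phrase_queries in queries.items():
    print(pattern, "->", ", ".join(phrase_queries))
```

Each generated string would then be submitted to a search engine and the reported hit count recorded by hand or by script.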

[MSN]                 I          you        he         she        it         we         they       [total]
as best __ can        183,672    152,044    31,785     11,353     28,837     217,952    98,167
as best as __ can     74,551     35,688     4,869      1,938      6,133      33,812     12,724
best/best as ratio    2.4        4.3        6.5        5.9        4.7        6.4        7.7        4.3

[Yahoo]               I          you        he         she        it         we         they       [total]
as best __ can        1,070,000  659,000    210,000    80,300     114,000    853,000    495,000
as best as __ can     438,000    148,000    33,400     2,800      30,300     148,000    67,500
best/best as ratio    2.4        4.5        6.3        28.7       3.8        5.8        7.3        4.0

Helpful Yahoo asks "Did you mean 'asbestos they can'?", although the suggested substitution gets only 95 yits compared to 67,500 for "as best as they can", and Yahoo doesn't make any such suggestion for any of the other pronouns in this pattern.

[Google]              I          you        he         she        it         we         they       [total]
as best __ can        830,000    466,000    132,000    51,000     95,100     667,000    377,000
as best as __ can     320,000    102,000    21,600     851        22,400     114,000    49,300
best/best as ratio    2.6        4.6        6.1        60.0       4.2        5.9        7.6        4.2

In this case, the (proportional) counts are generally pretty consistent across the search engines:

However, there's something funny going on with "she", as we can see better if we display the proportions on a log scale:

The oddity is even clearer if we plot the best/best as ratios:

Google and Yahoo have many fewer hits for the string "as best as she can" than they ought to, in proportion to their counts for "as best she can" and their counts for other pronouns in both patterns. What could be going on?
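The deficit can be checked directly from the counts in the tables above. The sketch below recomputes the best/best as ratio for each engine and pronoun and flags any ratio far above that engine's median. The names `COUNTS`, `ratios` and `outliers` are hypothetical, and the threshold of three times the median is an arbitrary choice for illustration, not something taken from the post.

```python
from statistics import median

PRONOUNS = ["I", "you", "he", "she", "it", "we", "they"]

# Hit counts copied from the tables above, per engine:
# (counts for "as best __ can", counts for "as best as __ can").
COUNTS = {
    "MSN": ([183672, 152044, 31785, 11353, 28837, 217952, 98167],
            [74551, 35688, 4869, 1938, 6133, 33812, 12724]),
    "Yahoo": ([1070000, 659000, 210000, 80300, 114000, 853000, 495000],
              [438000, 148000, 33400, 2800, 30300, 148000, 67500]),
    "Google": ([830000, 466000, 132000, 51000, 95100, 667000, 377000],
               [320000, 102000, 21600, 851, 22400, 114000, 49300]),
}


def ratios(best, best_as):
    """Ratio of "as best __ can" hits to "as best as __ can" hits, per pronoun."""
    return [b / a for b, a in zip(best, best_as)]


def outliers(values, factor=3.0):
    """Indices of values above factor * median (factor is an assumed threshold)."""
    cutoff = factor * median(values)
    return [i for i, value in enumerate(values) if value > cutoff]


flagged = {}
for engine, (best, best_as) in COUNTS.items():
    engine_ratios = ratios(best, best_as)
    flagged[engine] = [PRONOUNS[i] for i in outliers(engine_ratios)]
    print(engine, [round(r, 1) for r in engine_ratios], "flagged:", flagged[engine])
```

With this threshold only "she" is flagged, and only for Yahoo and Google; MSN's ratios stay within the normal range for every pronoun.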

If all three search engines showed the same deficit, we might explore the idea that this is telling us something about our culture's thought and language. But they don't, and so I strongly suspect that instead this is showing us something about the algorithms that Google and Yahoo use to prune SEO-blackhat web pages.

For linguistic as well as algorithmic comparison, here are the analogous numbers and pictures for the pattern "the best (that) __ can":

[MSN]                   I          you        he         she        it         we         they       [total]
the best __ can         462,164    659,558    65,128     23,822     284,798    508,639    277,363
the best that __ can    64,998     60,047     7,715      2,812      36,476     52,614     43,164
best/best that ratio    7.1        11.0       8.4        8.5        7.8        9.7        6.4        8.5

This time, by the way, helpful MSN asks "Were you looking for 'the beast that we can'?"

[Yahoo]                 I          you        he         she        it         we         they       [total]
the best __ can         2,940,000  3,180,000  422,000    183,000    1,050,000  2,700,000  1,350,000
the best that __ can    343,000    267,000    47,300     4,240      127,000    244,000    168,000
best/best that ratio    8.6        11.9       8.9        43.2       8.3        11.1       8.0        9.8

[Google]                I          you        he         she        it         we         they       [total]
the best __ can         1,830,000  1,700,000  280,000    93,100     795,000    1,660,000  1,280,000
the best that __ can    225,000    175,000    28,600     12,600     75,100     161,000    126,000
best/best that ratio    8.1        9.7        9.8        7.3        10.6       10.3       10.2       9.5

Again, the (proportional) counts are generally pretty consistent across the search engines:

But again, there's something funny going on with "she", though this time it only shows up in Yahoo's counts:
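One way to see the Yahoo-only anomaly in numbers is to convert each engine's counts for "the best that __ can" into per-pronoun shares and compare them on a log scale. This is a minimal sketch using the counts from the tables above; the names `THAT_COUNTS`, `shares`, `log_share` and `lowest` are hypothetical, and it is not a reconstruction of the original plots.

```python
import math

PRONOUNS = ["I", "you", "he", "she", "it", "we", "they"]

# Hit counts for "the best that __ can", copied from the tables above.
THAT_COUNTS = {
    "MSN": [64998, 60047, 7715, 2812, 36476, 52614, 43164],
    "Yahoo": [343000, 267000, 47300, 4240, 127000, 244000, 168000],
    "Google": [225000, 175000, 28600, 12600, 75100, 161000, 126000],
}


def shares(counts):
    """Each pronoun's proportion of the engine's total for this pattern."""
    total = sum(counts)
    return [c / total for c in counts]


# log10 of each share, so small proportions can be compared across engines.
log_share = {
    engine: [math.log10(s) for s in shares(counts)]
    for engine, counts in THAT_COUNTS.items()
}

she = PRONOUNS.index("she")
for engine, logs in log_share.items():
    print(f"{engine:6s} log10 share for 'she': {logs[she]:.2f}")

# Engine whose share for "she" is smallest.
lowest = min(log_share, key=lambda engine: log_share[engine][she])
print("lowest share for 'she':", lowest)
```

Because shares are normalised within each engine, the comparison is not affected by the engines' very different index sizes.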

I remain puzzled about what is really behind this -- maybe something about the typical language of porn site link nests? My interest in reverse engineering search engines is not great enough to motivate me to spend much more time investigating it. But if you know, or have a good guess, tell me and I'll tell the world.

Романюк Андрій Богданович, Юрчак Ірина Юріївна
