Soft hyphenation

Vlad Merzhevich

Unlike printed text, text on a web page is rarely hyphenated, because we are not tied to a fixed paper size. Sites are viewed on different monitors, at different resolutions, in different operating systems and browsers. Together these produce so many combinations that it is impossible to predict how the text will finally look for the user. For this reason text is usually left-aligned and lines break only between whole words. Nevertheless, hyphenation is sometimes needed: for example, when long chemical or medical terms are used, in narrow fixed-width columns, or simply for the sake of aesthetics. There are not many manual or automatic ways to add hyphenation in HTML and CSS, so I will list them all.

Using the <wbr> tag

The <wbr> tag was introduced in HTML5 and marks a place where a word may be broken if necessary. Insert <wbr> at the points where the rules of the Russian language permit a break (example 1). If the whole word fits in the available width, the tag has no visible effect and we will not even notice its presence. If the word does not fit, the browser breaks the line at the position of the <wbr> tag.
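As a rough illustration of how such markup might be produced, the Python sketch below joins hand-chosen word parts with <wbr>. The function name `mark_breaks` and the sample split are assumptions made for this example; the break points themselves still have to be chosen by someone who knows the rules of the language.

```python
import html

def mark_breaks(parts, marker="<wbr>"):
    """Join word parts with a break marker, escaping the text itself."""
    return marker.join(html.escape(part) for part in parts)

# Hypothetical split of a long word into parts where a break is acceptable.
print(mark_breaks(["hy", "phen", "ation"]))
```

The marker is a parameter, so the same helper could emit a different break hint without touching the word list.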

Example 1. The <wbr> tag

Hyphenation

After graduating from school, eleventh-grader Angelica chose the profession of clerk.



The result of this example is shown in fig. 1.

Fig. 1. Text with word breaks

Soft hyphen

Using <wbr> has a serious drawback: the reader cannot tell whether a line ends with part of a broken word or with a separate word. Because of this, the meaning of the sentence may be lost or misunderstood. Words should be broken according to the rules of typography, namely with a hyphen at the end of the line. The soft hyphen handles this perfectly; in HTML it has a special character entity, &shy;. It plays the same role as the <wbr> tag: &shy; is invisible in ordinary text and allows the word to continue on the next line, but it adds a hyphen at the break (example 2).
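The behaviour described above can be modelled in a few lines of Python. This is only a simplified sketch, not the algorithm any browser uses: it counts characters instead of measuring rendered width, handles a single word, and the name `wrap_word` is an assumption made for this example.

```python
SHY = "\u00ad"  # the soft hyphen character that &shy; stands for

def wrap_word(word, width):
    """Break one word at soft hyphens so that every line fits in `width` characters."""
    chunks = word.split(SHY)
    lines, current = [], chunks[0]
    for i, chunk in enumerate(chunks[1:], start=1):
        is_last = i == len(chunks) - 1
        # A line that ends in a break needs one extra column for the visible hyphen.
        needed = len(current) + len(chunk) + (0 if is_last else 1)
        if needed <= width:
            current += chunk          # unused soft hyphens simply disappear
        else:
            lines.append(current + "-")
            current = chunk
    lines.append(current)
    return lines

word = "hy" + SHY + "phen" + SHY + "ation"
print(wrap_word(word, 20))  # wide column: the word stays whole
print(wrap_word(word, 8))   # narrow column: broken with a hyphen
```

A chunk that is longer than the column on its own is left to overflow here, which keeps the sketch short.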

Example 2. Soft hyphen

Hyphenation

After graduating from school, eleventh-grader Angelica chose the profession of clerk.



The result of this example is shown in fig. 2. Notice how much neater and clearer the text looks compared with fig. 1.

Fig. 2. Text with soft hyphens

The word-break property

To break words automatically, use the word-break property with the value break-all (example 3). You no longer have to add any characters or tags to the HTML; the styles take care of everything.

Example 3. Using word-break

Hyphenation

After graduating from school, eleventh-grader Angelica chose the profession of clerk.



The result of this example is shown in fig. 3. The hyphenation rules of the language are not taken into account in this case, so words can be broken in very bizarre places.

Fig. 3. Text with word-break
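To see why the result looks bizarre, here is a deliberately crude Python model of what break-all does to a single long word: it cuts wherever the line is full. Real browsers measure rendered width rather than counting characters, and the name `break_all` is an assumption made for this sketch.

```python
def break_all(word, width):
    """Cut a word into pieces of at most `width` characters, ignoring language rules."""
    return [word[i:i + width] for i in range(0, len(word), width)]

print(break_all("hyphenation", 4))
```

None of the cuts falls on a syllable boundary and no hyphen is added, which matches the behaviour described above.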

Of all the methods listed so far, the “semi-manual” one using &shy; gives the best result: the rules of the Russian language are followed and the text looks the most aesthetically pleasing. Use it when long words appear in the text.

The hyphens property

And finally, the most powerful and convenient way to add hyphenation automatically is the hyphens property. It relies on the hyphenation dictionary built into the browser, so it gives the best result. At the time of writing it is supported in IE10, Firefox, Android and iOS; Chrome and Opera do not support it. For it to work, add the lang attribute with the value ru to the <html> tag (example 4).
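As a minimal sketch of the two ingredients just mentioned, the Python helper below assembles a page with a lang attribute and the hyphens: auto declaration. The function name `build_page` and the 200px column width are assumptions made for this example, and older browsers may additionally need vendor-prefixed forms of the property.

```python
import html

def build_page(text, lang="ru"):
    """Return a minimal HTML page whose paragraph asks the browser to hyphenate."""
    return (
        "<!DOCTYPE html>\n"
        f'<html lang="{html.escape(lang, quote=True)}">\n'
        "<head><meta charset=\"utf-8\">\n"
        "<style>p { width: 200px; hyphens: auto; }</style></head>\n"
        f"<body><p>{html.escape(text)}</p></body>\n"
        "</html>\n"
    )

print(build_page("Sample paragraph text"))
```

Whether any words are actually hyphenated still depends on the browser having a dictionary for the declared language.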

Example 4. Using hyphens

Hyphenation

After graduating from school, eleventh-grader Angelica chose the profession of clerk.



The result of this example is shown in fig. 4.

Fig. 4. Text with automatic hyphenation

Preventing line breaks

Often the opposite problem arises: breaks must be prevented where the rules of the language do not allow them. For example, you must not separate a unit of measure from its number (10 ml), the era from a year (54 BC) or initials from a surname, and you must not split fixed abbreviations (etc.). To stop the browser from breaking the line at a space, replace the space with a non-breaking space, &nbsp; (example 5).

Example 5. Using &nbsp;

Hyphenation

The lake at coordinates 70°58′19″ N, 97°24′5″ E is located in the Taimyr Dolgan-Nenets district of Krasnoyarsk Krai, Russia.



In this example &nbsp; is used to write the coordinates correctly: it does not allow the text to be broken at those spaces.
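Replacing such spaces by hand is tedious, so it can be scripted. The Python sketch below glues a number to the unit that follows it; the function name `glue_units` and the short unit list are assumptions made for this example, and a real text would need a longer list and further rules (initials, abbreviations and so on).

```python
import re

UNITS = ("ml", "km", "BC")  # illustrative list, extend as needed

def glue_units(text, units=UNITS):
    """Replace the space between a number and a following unit with &nbsp;."""
    alternatives = "|".join(re.escape(u) for u in units)
    pattern = rf"(\d) (?=(?:{alternatives})\b)"
    return re.sub(pattern, r"\1&nbsp;", text)

print(glue_units("10 ml of water in 54 BC"))
```

Ordinary spaces between words are left alone, so the browser can still break the line everywhere else.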

Alexey V. Ivanov
  The hyphenation mark is nice to have. But how many words do you know that are 40+ characters long? I do not think the meaning suffers much if there is no hyphenation mark.

I agree, it is not ideal, but is there another, universal solution?

No. There seems to be a consensus among browser developers: they do not want people stuffing texts with soft hyphens. What exists now is a leftover from the browser wars, and using it is not recommended. Instead, they want to implement automatic hyphenation, but that is not so simple to do ...

Strange ... My favorite text reader, Haali Reader, a small program of 500 kilobytes, places hyphens perfectly. And I do not care if it breaks some tricky word incorrectly. It works 99.9% of the time ...

Browser developers work under harsher conditions. How many site authors will accept that their site will, with some probability however small, be shown with stupid or funny grammatical errors? I would not want this even for the "populist" WebWarper, and for a company's official website it is completely unacceptable. Unlike in Word, for example, on a web page it is absolutely impossible to predict what the font size will be relative to the area reserved for the text and, accordingly, where line breaks will fall.

And guaranteeing correct hyphenation in all cases, for all popular languages, with the various typographic conventions on top (which breaks should be avoided even if they are formally correct) is, I'm afraid, simply not feasible. Therefore there is little hope for CSS3 hyphenate, IMHO.

But are there really still no text or HTML editors that make it easy to insert soft hyphens automatically and then review and correct the result? So that in the normal editing mode (say, of HTML code) I would not see these hyphens, just as I do not see spaces and tabs, but could switch on a mode in which hyphens are visible and editable? That would be very valuable. After all, typesetting without hyphenation hurts readability, especially for languages with long words, such as Russian or German. No wonder almost all books and magazines use hyphenation.

Starynin Valery
  There are several problems for browsers:

  1. Each language has its own hyphenation rules. So all these languages have to be supported, and the language has to be detected correctly too. Looking at TeX: for German alone (two spelling variants) the dictionary takes 100 KB, or 50 KB packed. That adds up to a solid amount of extra weight even if only the most common languages are supported, and browsers are expected to be small these days.
  2. TeX handles hyphenation best. But the algorithm used there works on whole paragraphs, and adding one word can change everything. When a page is rendered as data arrives and dynamic HTML runs, text laid out this way would “jump”.
  3. Testing functionality that works differently for each language is non-trivial.

For Gecko, this issue was discussed in bug 67715

Daniel Aliyevsky
  CSS3 seems to provide a way to specify "exceptions" on the page: words that are not hyphenated according to the rules.

Vladimir Palant
  The problem is not so much describing "exceptions": after all, you can always forbid breaking certain words and even combinations of words. The problem is testing a large text, that is, making sure that every word is broken correctly in every situation. How can this be done even in principle? There are different font sizes, pixel sizes, window sizes, different ways of enlarging the font ... In the end, almost any word in the text may be broken. The only way to verify that the browser breaks everything correctly is a special “debugging” mode of the browser (of every popular browser, at that) in which all possible breaks are explicitly marked in some way. And even then, consider how laborious a basic manual check of even a small site would be. Spelling errors are visible to the eye and are often easily found by the likes of Word, but checking every potential hyphenation in 200 KB of text ... hmm.

Alexey Sevryukov: why the browser? All browsers already “know” that a space, a line feed and a tab are one and the same thing: whitespace. What prevents them from “knowing” that &shy; is, on the whole, not really a character at all?

But what about my arguments? How do you test hyphenation? There are bound to be mistakes now and then. And to check, you need to see with your own eyes every position at which each word can be broken at the relevant font size. How many people would want this option in CSS if there is no strict guarantee against errors?

An editor that inserts hyphens as special characters but otherwise displays the text “as is” can have very advanced tools for controlling both the layout itself (say, more or less “strict”) and its testing. Competition is possible: different editors will compete on reliability, and there will be national editors that work better with Russian or German, and so on.

But, of course, &shy; is “heavy”: any text will double in size. It would be better if it were a reserved ASCII code. We are not bothered that texts are “littered” with spaces, tabs, CR and LF characters. Why not add one more control character for hyphens, which, like tabs and line breaks, is clearly visible only in a special editor mode?

Ideally, this could be a feature not only of HTML and browsers but of any system for editing and viewing text.

Y. Popov aka Jaded
  I do not know about you, but it seems to me that cluttering the text on the page is harmful. If some external solution (for example, in JS) is used and the page source stays untouched, that is acceptable. It is also interesting how text hyphenated this way will be copied; that needs checking at some point.
  IMHO.

Alexey Sevryukov
  Clearly, if a layout editor were built right now, head-on, with &shy;, it would not be very pretty. But if all browsers support it, competition will develop and hyphenating text on sites will become the norm, and then the overall picture may change. Everything will start to copy normally, and hyphens will not be visible in the source text (in most editors, except in a special mode). Especially if some new reserved code is used instead of &shy;, at least ASCII 01.

By the way, this is probably the main problem: hyphenation is marked not with a special reserved character (as tabulation is) but with &shy;. This immediately limits the range of possible editors to HTML only. Why not, for example, a message for alert in JavaScript? Copying becomes more complicated, and the editor mode becomes unnatural: it must “hide” hyphens while showing everything else (because &shy; is part of the source code).

Soft hyphen

Hyphenation

[Illustration: a sample of text set in a narrow column with hyphenated line breaks]

The main function of hyphenation is aesthetic. Without hyphenation, some lines come out poorly filled (which is especially noticeable when setting narrow columns). In addition, hyphenated text takes up less space.

At the same time, hyphenated text is harder to read, so hyphenation is not used in books for the youngest children.

Hyphenation marks

In most modern European scripts the hyphenation mark is graphically identical to the hyphen and is placed after the first part of the broken word. Older typefaces (both Latin and Cyrillic) used more varied forms of this mark:

  • a horizontal line at the baseline of the letters (like the underscore _);
  • a line whose right end is bent upwards;
  • a small slash /;
  • a mark in the form of two slanted strokes (a cross between = and //).

In some writing systems a break is not marked by any special sign at all: the word is simply split between lines. In particular, Cyrillic printing did without a hyphenation mark until the middle of the 17th century (this tradition is preserved among the Old Believers; for more details see the article “Yerok”); the same is true of some modern scripts, mostly Asian (not only logographic ones but also alphabetic ones such as Thai).

Complex hyphenation

In most languages, hyphenation comes down to breaking the word (and adding a hyphen); however, in some words of some languages the letters themselves or their diacritics change at the break, for example:

  • English: eighteen → eight-//teen;
  • Hungarian: asszonnyal → asz-//szony-//nyal;
  • Dutch: reëel → re-//eel, omaatje → oma-//tje;
  • Greek: Μα ϊ̓ ου → Μα-// ου;
  • Catalan: paral·lel → paral-//lel;
  • German: Zucker → Zuk-//ker, Schiffahrt → Schiff-//fahrt (according to the traditional spelling; in the recently introduced new spelling, Zu-//cker and Schifffahrt);
  • Swedish: glassko → glas-//sko, glass-//ko, glass-//sko (depending on the meaning of the word).

Permitted break points

In general, words can be broken either at syllable boundaries or at morpheme boundaries. Each language has its own rules for determining where a break is permitted (in English these are often shown in dictionaries; the British and American systems differ fundamentally).

Implementation in computers

The task of automatically finding permitted break points arose as soon as computers began to be used for typesetting and publishing (the 1950s). Systems were based either on dictionaries that listed the hyphenation points of each word, or on algorithms in the form of a set of rules such as “if you see this combination of letters, you may (may not) break here”. The first approach, especially on old hardware, was inconvenient because of the size of the required databases (and, for obvious reasons, was unsuitable for previously unknown words); the second, with empirically compiled rules, for a long time did not give acceptable quality. The situation changed in 1983, when Franklin Mark Liang, a student of Donald E. Knuth, proposed an algorithm that, from a dictionary with marked hyphenation points, builds a compact set of rules that reproduces exactly those hyphenation points. As experiments showed, for new words (not in the training dictionary) such a rule set also finds good break points in the vast majority of cases. Liang's system was originally integrated with a well-known program
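Rule sets of this kind are commonly written as letter patterns with digits between the letters: an odd digit permits a break, an even digit forbids one, and where several patterns overlap the highest digit wins. The Python sketch below applies such patterns to one word. It covers only the lookup step, not the generation of patterns from a dictionary; the tiny pattern list is illustrative rather than a usable language file, and the names `parse_pattern` and `hyphenate` are assumptions made for this example.

```python
import re

# A handful of illustrative patterns; real pattern files hold thousands.
PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]

def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into its letters and inter-letter scores."""
    letters = re.sub(r"\d", "", pattern)
    scores = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            scores[pos] = int(ch)
        else:
            pos += 1
    return letters, scores

def hyphenate(word, patterns=PATTERNS, left_min=2, right_min=2):
    """Return the pieces of `word` between the break points the patterns allow."""
    parsed = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."          # dots mark the word boundaries
    points = [0] * (len(padded) + 1)           # points[j] = score before padded[j]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            scores = parsed.get(padded[start:end])
            if scores:
                for offset, score in enumerate(scores):
                    points[start + offset] = max(points[start + offset], score)
    pieces, last = [], 0
    for i in range(left_min, len(word) - right_min + 1):
        if points[i + 1] % 2 == 1:             # odd score: break allowed before word[i]
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return pieces

print("-".join(hyphenate("hyphenation")))
```

A word that matches no pattern comes back in one piece, which is the safe default: no break is better than a wrong one.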

Breaking phrases

Russian orthography imposes no restrictions here. However, the rules of careful typesetting prescribe that short (especially single-letter) prepositions and conjunctions should not be separated from the text that follows them, nor short particles (primarily б and ж) from the text that precedes them, and so on. Separating the negative particle не from the following text is not recommended (for the same reason that it is undesirable to split off such a syllable when hyphenating a word, see above). You must not split abbreviations such as i.e. or etc., separate initials from each other or from the surname, or separate a number from the word it belongs to (Peter I) or from its unit of measure (1 km), and so on.

Special rules govern where punctuation marks go when a line is broken:

  • opening brackets and quotation marks, as well as an ellipsis at the beginning of a phrase, stay with the text that follows;
  • other punctuation marks stay with the preceding text.

Breaking formulas

In the Russian typographic tradition, formulas may be broken at the signs of certain binary operations (plus, minus and so on, although breaking at a division sign is not allowed) or relations (equality, inequality and so on). The sign is then repeated on both sides of the break (foreign typesetting systems do not do this).
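The repeated-sign convention is easy to express in code. The Python sketch below breaks a formula that has already been split into tokens; the name `break_formula`, the token representation and the small set of breakable signs are assumptions made for this example.

```python
BREAKABLE = {"+", "-", "=", "<", ">"}  # division is deliberately absent

def break_formula(tokens, at):
    """Break a tokenised formula at tokens[at], repeating the sign on both lines."""
    sign = tokens[at]
    if sign not in BREAKABLE:
        raise ValueError(f"cannot break a formula at {sign!r}")
    first = " ".join(tokens[: at + 1])   # the line ends with the sign
    second = " ".join(tokens[at:])       # the next line starts with the same sign
    return first, second

first, second = break_formula("a + b + c = d".split(), 3)
print(first)
print(second)
```

Asking for a break at a division sign or at an operand raises an error instead of producing a forbidden layout.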

A formula can also be broken at an ellipsis (again repeating it at the beginning of the new line) if the ellipsis stands for omitted middle terms of an expression or enumeration: a formula such as 1 + 2 + ... + (N − 1) + N can be broken at the ellipsis, but 1 / 0! + 1 / 1! + 1 / 2! + 1 / 3! + ... = e cannot (although it can be broken at any plus sign except the last, and at the equals sign).

In addition, formulas can be broken (without repeating the character) after enumeration characters such as commas or semicolons.

There are references to a method of breaking long radical expressions and fractions (those with a horizontal bar): the radicand (or the numerator and denominator of the fraction) is split according to the usual rules, and the bar of the radical or fraction is given arrows at its ends where the break occurs.


Wikimedia Foundation. 2010.
