<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>standard library Archives | Programming Zen</title>
	<atom:link href="https://programmingzen.com/tag/standard-library/feed/" rel="self" type="application/rss+xml" />
	<link>https://programmingzen.com/tag/standard-library/</link>
	<description>Meditations on programming, startups, and technology</description>
	<lastBuildDate>Fri, 11 Oct 2019 16:06:56 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
<site xmlns="com-wordpress:feed-additions:1">1397766</site>	<item>
		<title>String Length in Elixir</title>
		<link>https://programmingzen.com/string-length-in-elixir/</link>
					<comments>https://programmingzen.com/string-length-in-elixir/#comments</comments>
		
		<dc:creator><![CDATA[Antonio Cangiano]]></dc:creator>
		<pubDate>Mon, 14 Oct 2019 12:00:11 +0000</pubDate>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[elixir]]></category>
		<category><![CDATA[elixir-cookbook]]></category>
		<category><![CDATA[standard library]]></category>
		<category><![CDATA[strings]]></category>
		<category><![CDATA[unicode]]></category>
		<guid isPermaLink="false">https://programmingzen.com/?p=2286</guid>

					<description><![CDATA[<p>In a previous post, I wrote about Hello World in Elixir. Using such a simple program allowed me to discuss a few concepts about the language. This post explores strings further, by discussing how to find the length of a string in Elixir. Simple enough, but there is more than meets the eye. Elixir String Length In Elixir, you can return the number of characters in a string with the String.length/1 function: String.length(&#34;Antonio&#34;) # 7 String.length(&#34;&#34;) # 0 String.length(&#34;r&#233;sum&#233;&#34;) # 6 Discussion In the case of a string like, &#34;Antonio&#34;, there isn&#8217;t much to discuss. String.length(&#34;Antonio&#34;) returns the number of </p>
<p>The post <a href="https://programmingzen.com/string-length-in-elixir/">String Length in Elixir</a> appeared first on <a href="https://programmingzen.com">Programming Zen</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>In a previous post, I wrote about <a href="https://programmingzen.com/elixir-hello-world/">Hello World in Elixir</a>. Using such a simple program allowed me to discuss a few concepts about the language. This post explores strings further, by discussing how to find the <strong>length of a string in Elixir</strong>.</p>



<p>Simple enough, but there is more than meets the eye.</p>



<h2 class="wp-block-heading">Elixir String Length</h2>



<p>In Elixir, you can return the number of characters in a string with the <code>String.length/1</code> function:</p>



<pre class="wp-block-code"><code class="prettyprinted">String.length("Antonio") # 7
String.length("")        # 0
String.length("résumé")  # 6</code></pre>



<h2 class="wp-block-heading">Discussion</h2>



<p>For a string like <code>"Antonio"</code>, there isn&#8217;t much to discuss. <code>String.length("Antonio")</code> returns the number of characters in the string.</p>



<p>The string clearly has 7 characters and since each character in this particular string can be represented with a single byte, its raw representation is 7 bytes as well.</p>



<p>Things become more interesting when a string contains characters outside the ASCII range, such as accented letters.</p>



<p>Consider the string, <code>"résumé"</code>. In this case, <code>String.length/1</code> returns 6. You can think of this as the number of &#8220;visible&#8221; or &#8220;user-perceived&#8221; characters in the string. However, let&#8217;s investigate its raw representation:</p>



<pre class="wp-block-code"><code class="prettyprinted">iex(2)&gt; i "résumé"
Term
  "résumé"
Data type
  BitString
Byte size
  8
Description
  This is a string: a UTF-8 encoded binary. It's printed surrounded by
  "double quotes" because all UTF-8 encoded code points in it are printable.
Raw representation
  &lt;&lt;114, 195, 169, 115, 117, 109, 195, 169&gt;&gt;
Reference modules
  String, :binary
Implemented protocols
  Collectable, IEx.Info, Inspect, List.Chars, String.Chars</code></pre>



<p>You&#8217;ll notice that the string&#8217;s actual data type is BitString. Specifically, this binary is made up of 8 bytes, even though there are only 6 user-perceived characters. This is because each accented é takes 2 bytes in UTF-8, while the other four letters take 1 byte each.</p>



<p>If you are interested in the number of bytes within a string, instead of its length, you can use the <code>Kernel.byte_size/1</code> function:</p>



<pre class="wp-block-code"><code class="prettyprinted">iex(3)&gt; string = "résumé"
"résumé"

iex(4)&gt; String.length(string)
6

iex(5)&gt; byte_size(string)
8</code></pre>



<p>From a performance standpoint, it&#8217;s worth noting that <code>Kernel.byte_size/1</code> is more efficient than <code>String.length/1</code>. The latter has to traverse the string, so it takes longer as the string grows, whereas <code>Kernel.byte_size/1</code> returns in constant time.</p>
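
<p>A rough way to observe the difference is to time both calls on a large string. This is only a sketch: the string size is an arbitrary choice, and the timings will vary from machine to machine, so treat the numbers as indicative rather than as a benchmark.</p>

<pre class="wp-block-code"><code># Build a large ASCII string so the difference is easy to see.
big = String.duplicate("a", 1_000_000)

# :timer.tc/1 returns {elapsed_microseconds, result}.
{length_us, length_result} = :timer.tc(fn -&gt; String.length(big) end)
{bytes_us, bytes_result} = :timer.tc(fn -&gt; byte_size(big) end)

IO.puts("String.length/1: #{length_result} in #{length_us} microseconds")
IO.puts("byte_size/1:     #{bytes_result} in #{bytes_us} microseconds")</code></pre>

<p>Both calls return the same count here because every character in this string is a single byte.</p>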



<p><a href="https://hexdocs.pm/elixir/String.html">Strings</a> in Elixir are UTF-8 encoded binaries. You can think of them as collections of code points. A code point is a Unicode character, whose underlying representation might require one or more bytes. For example, the é characters in the string <code>"résumé"</code> are code points whose representation requires two bytes each.</p>
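
<p>As a minimal sketch of the relationship between code points and bytes (the variable names are only illustrative), the <code>utf8</code> binary modifier can be used in a pattern match to decode the bytes of a character back into its code point:</p>

<pre class="wp-block-code"><code># "\u00E9" is the precomposed é, written as an escape to avoid ambiguity.
e_acute = "\u00E9"

# Matching with the utf8 modifier decodes the bytes back into a code point.
&lt;&lt;code_point::utf8&gt;&gt; = e_acute

IO.inspect(code_point)                     # 233
IO.inspect(byte_size(e_acute))             # 2
IO.inspect(:binary.bin_to_list(e_acute))   # [195, 169]</code></pre>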



<p>The Unicode standard also defines some special characters as the combination of other characters. In other words, even though they appear to the reader as a single character, they are in fact a combination of two or more code points. These are known as grapheme clusters.</p>



<p>For example, the e-acute letter can be represented as a single code point as we&#8217;ve done so far (this is also known as a precomposed character) or as a combination of two code points (the letter e and a combining acute accent). These look the same but they are technically two different characters as far as Elixir is concerned; so the two strings below end up representing two different binaries:</p>



<pre><code class="prettyprinted">iex(6)&gt; "é" == "e&#769;"
false</code></pre>
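
<p>If the two forms should be treated as equal, one option is to compare them after Unicode normalization. The sketch below relies on the standard library&#8217;s <code>String.equivalent?/2</code> and <code>String.normalize/2</code>, and writes both forms with escapes so that they can&#8217;t be confused:</p>

<pre class="wp-block-code"><code>precomposed = "\u00E9"   # a single code point
combined = "e\u0301"     # the letter e followed by a combining acute accent

IO.inspect(precomposed == combined)                          # false
IO.inspect(String.equivalent?(precomposed, combined))        # true
IO.inspect(String.normalize(combined, :nfc) == precomposed)  # true</code></pre>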



<p>When working with strings, you&#8217;ll often want to consider them in terms of the user-perceived characters, rather than their code points or the binary they actually represent.</p>



<p>To help us out, the Elixir <code>String</code> module provides <code>String.graphemes/1</code>, which returns a list of characters without splitting grapheme clusters into their underlying code points. If you need the code points, you can always use <code>String.codepoints/1</code>.</p>



<p>To see the distinction between code points and graphemes in action, consider the following string (using the grapheme cluster built from two code points):</p>



<pre class="wp-block-code"><code class="prettyprinted">iex(7)&gt; String.codepoints("cliche&#769;")
["c", "l", "i", "c", "h", "e", "&#769;"]

iex(8)&gt; String.graphemes("cliche&#769;")
["c", "l", "i", "c", "h", "e&#769;"]</code></pre>



<p>As you can see, <code>String.codepoints/1</code> shows us a list of code points in the string, and the special e-acute gets split into two code points. If you look closely, you&#8217;ll notice the accent as the last code point in the list.</p>



<p><code>String.graphemes/1</code> simply returns the list of graphemes and is the closest thing that we have to a function that provides a list of user-perceived characters.</p>



<p>Now, consider its length and byte size:</p>



<pre class="wp-block-code"><code class="prettyprinted">iex(9)&gt; String.length("cliche&#769;")
6

iex(10)&gt; byte_size("cliche&#769;")
8</code></pre>



<p><code>String.length/1</code> returns 6, the number of user-perceived characters in the string. <code>Kernel.byte_size/1</code> returns 8 because it takes 3 bytes to represent the special character/grapheme (1 for the letter e, and 2 for the combining acute accent).</p>



<p>We used the expression &#8220;visible&#8221; or &#8220;user-perceived&#8221; characters to give us an intuitive understanding.  Now that you know more about bytes, code points, and graphemes, we can be a little more precise and say that <code>String.length/1</code> returns the number of graphemes in a string.</p>
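
<p>A quick way to check this claim is to compare the result of <code>String.length/1</code> with the size of the list returned by <code>String.graphemes/1</code>. The sketch below writes the combining accent as the escape <code>\u0301</code> so that the string is unambiguous:</p>

<pre class="wp-block-code"><code>string = "cliche\u0301"

grapheme_count = string |&gt; String.graphemes() |&gt; length()

IO.inspect(String.length(string))                    # 6
IO.inspect(String.length(string) == grapheme_count)  # true</code></pre>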



<p>Finally, if you wanted to know the number of code points, you could compose the <code>String.codepoints/1</code> and <code>Kernel.length/1</code> functions:</p>



<pre class="wp-block-code"><code class="prettyprinted">"cliche&#769;"
  |&gt; String.codepoints()
  |&gt; length()</code></pre>



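<p>Note that IEx evaluates each line as soon as it forms a complete expression. If you are trying this in IEx, you&#8217;ll need to tell it that the expression continues, for example by ending each line with a backslash:</p>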



<pre class="wp-block-code"><code>iex(11)> "cliche&#769;" \
...(11)> |> String.codepoints() \
...(11)> |> length()
7</code></pre>



<p>So in summary, our string contains 6 graphemes, 7 code points, and 8 bytes.</p>
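
<p>To gather the three measurements in one place, you could wrap them in a small helper. <code>StringStats</code> and <code>counts/1</code> are hypothetical names used for illustration, not part of the standard library:</p>

<pre class="wp-block-code"><code>defmodule StringStats do
  # Hypothetical helper: returns the three sizes discussed in this post.
  def counts(string) when is_binary(string) do
    %{
      graphemes: String.length(string),
      code_points: string |&gt; String.codepoints() |&gt; length(),
      bytes: byte_size(string)
    }
  end
end

stats = StringStats.counts("cliche\u0301")

IO.inspect(stats.graphemes)    # 6
IO.inspect(stats.code_points)  # 7
IO.inspect(stats.bytes)        # 8</code></pre>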
]]></content:encoded>
					
					<wfw:commentRss>https://programmingzen.com/string-length-in-elixir/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">2286</post-id>	</item>
		<item>
		<title>Elixir Hello World</title>
		<link>https://programmingzen.com/elixir-hello-world/</link>
					<comments>https://programmingzen.com/elixir-hello-world/#respond</comments>
		
		<dc:creator><![CDATA[Antonio Cangiano]]></dc:creator>
		<pubDate>Fri, 11 Oct 2019 04:49:04 +0000</pubDate>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[elixir]]></category>
		<category><![CDATA[elixir-cookbook]]></category>
		<category><![CDATA[erlang]]></category>
		<category><![CDATA[puts]]></category>
		<category><![CDATA[standard library]]></category>
		<category><![CDATA[strings]]></category>
		<guid isPermaLink="false">https://programmingzen.com/?p=2268</guid>

					<description><![CDATA[<p>It is customary to start programming language tutorials with Hello World programs. So today I&#x2019;m sharing with you a Hello World in Elixir, one of my favorite programming languages (along with Ruby and Python, of course). As you likely know, a Hello World is a very simple program that displays the phrase, Hello, World! Our Elixir Hello World might not be too exciting but it will allow us to discuss quite a few fundamental concepts. Hello World in Elixir As you might expect, this is very straightforward: How to run it The easiest way to run this line of code </p>
<p>The post <a href="https://programmingzen.com/elixir-hello-world/">Elixir Hello World</a> appeared first on <a href="https://programmingzen.com">Programming Zen</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>It is customary to start programming language tutorials with Hello World programs. So today I&#8217;m sharing with you a <strong>Hello World in Elixir</strong>, one of my favorite programming languages (along with Ruby and Python, of course).</p>



<p>As you likely know, a Hello World is a very simple program that displays the phrase <code>Hello, World!</code>.</p>



<p>Our <strong>Elixir Hello World</strong> might not be too exciting but it will allow us to discuss quite a few fundamental concepts.</p>



<h2 class="wp-block-heading">Hello World in Elixir</h2>



<p>As you might expect, this is very straightforward:</p>



<pre class="wp-block-code"><code>IO.puts("Hello, World!")</code></pre>



<h2 class="wp-block-heading">How to run it</h2>



<p>The easiest way to run this line of code is to launch IEx (i.e., Interactive Elixir). This will also act as a sanity test to ensure that you have installed Elixir correctly on your machine.</p>



<pre class="wp-block-code"><code>$ iex
Erlang/OTP 22 [erts-10.4.4] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [hipe] [dtrace]

Interactive Elixir (1.9.1) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> </code></pre>



<p>For the best experience on Windows, including tab-based autocompletion in IEx, it&#8217;s worth passing the <code>--werl</code> flag to IEx or permanently enabling it by setting the environment variable <code>IEX_WITH_WERL</code> to <code>true</code>.</p>



<pre class="wp-block-code"><code>C:\> iex --werl</code></pre>



<p>Once you type your Elixir Hello World and press enter, IEx will print out the message <code>Hello, World!</code> and then display the return value of the function, which is the atom <code>:ok</code><sup>1</sup> (indicating that the function executed successfully):</p>



<pre class="wp-block-code"><code>iex(2)> IO.puts("Hello, World!")
Hello, World!
:ok
iex(3)></code></pre>



<p>Alternatively, you could place our one-liner in a <code>hello.exs</code> file and execute it as a script:</p>



<pre class="wp-block-code"><code>$ elixir hello.exs
Hello, World!</code></pre>



<p>You could also create a whole (Mix) project for this, but arguably that is overkill for our humble Hello World. </p>



<h2 class="wp-block-heading">Discussion</h2>



<p><code>puts</code> is a function that prints a message and adds a newline character. <code>write</code> is the variant that does the same but doesn&#8217;t append a newline. They both belong to the <code>IO</code> module, a module that includes, as the name implies, various functions for handling input and output.</p>



<p>At any time, we can learn more about a function by using the <code>h</code> helper within IEx:</p>



<pre class="wp-block-code"><code>iex(3)> h IO.puts

                        def puts(device \\ :stdio, item)

  @spec puts(device(), chardata() | String.Chars.t()) :: :ok

Writes item to the given device, similar to write/2, but adds a newline at the
end.

By default, the device is the standard output. It returns :ok if it succeeds.

## Examples

    IO.puts("Hello World!")
    #=> Hello World!

    IO.puts(:stderr, "error")
    #=> error

iex(4)></code></pre>



<p>Unlike in languages such as Python and Ruby, in Elixir you need to use a qualified call that includes the module name in order to invoke this function.</p>



<p>If we don&#8217;t qualify the call, we get a compile error:</p>



<pre class="wp-block-code"><code>iex(4)> puts("Hello, World!")
** (CompileError) iex:1: undefined function puts/1</code></pre>



<p>If you wanted to omit the module name, you could import the module (i.e., <code>import IO</code>) beforehand, or more realistically, limit the import to the function(s) that you need:</p>



<pre class="wp-block-code"><code>import IO, only: [puts: 1]

puts("Hello, World!")</code></pre>



<p>That <code>puts: 1</code> indicates that we are specifically importing the version of the function that accepts one argument. You&#8217;ll see that function referred to as <code>IO.puts/1</code> and we say that it has an arity of one or it&#8217;s a one-arity function.</p>



<p>The arity of a function (i.e., the number of arguments it accepts) is important because in Elixir you can have functions with the same name but different arities.</p>
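
<p>As a small illustration, the sketch below defines two functions that share a name but differ in arity. <code>Greeter</code> is a hypothetical module made up for this example:</p>

<pre class="wp-block-code"><code>defmodule Greeter do
  # hello/0 and hello/1 share a name but are distinct functions.
  def hello, do: "Hello, World!"
  def hello(name), do: "Hello, #{name}!"
end

IO.puts(Greeter.hello())          # Hello, World!
IO.puts(Greeter.hello("Elixir"))  # Hello, Elixir!</code></pre>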



<p>In fact, there are technically two variants of the <code>puts</code> function, <code>IO.puts/1</code> which we just used, and <code>IO.puts/2</code> which also allows us to specify an IO device as its first argument. (I say <em>technically</em> because they are both declared within a single function definition.)</p>



<p>By default, the function prints to the standard output. This can be easily verified by looking at the Elixir source code:</p>



<pre class="wp-block-code"><code>def puts(device \\ :stdio, item) do
  :io.put_chars(map_dev(device), [to_chardata(item), ?\n])
end</code></pre>



<p>As you can see, the underlying implementation makes a call to Erlang&#8217;s <code>:io.put_chars/2</code> function.</p>



<p>What&#8217;s worth noting here is that:</p>



<ul class="wp-block-list"><li>In Elixir, default parameters can be specified using <code>\\</code>.</li><li>The default IO device for the function is the atom <code>:stdio</code> which maps to <code>:standard_io</code> in Erlang. Basically, by default, it will print to your stdout unless you pass a different device to the function.</li><li>Another common value for the device is <code>:stderr</code> which is a shortcut for Erlang&#8217;s <code>:standard_error</code>.</li><li>When we call the function without a device (leveraging the default parameter) we are calling <code>IO.puts/1</code>. When we pass it a device as its first argument and a string as its second argument, we are calling <code>IO.puts/2</code>.</li></ul>
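
<p>The points about default parameters can be sketched with a function of the same shape as <code>puts</code>: a defaulted first parameter followed by a required one. <code>Announcer</code> and <code>announce</code> are hypothetical names used only for illustration:</p>

<pre class="wp-block-code"><code>defmodule Announcer do
  # Hypothetical module mirroring the shape of IO.puts: a defaulted first
  # parameter followed by a required one.
  def announce(prefix \\ "info", message) do
    "[#{prefix}] #{message}"
  end
end

IO.puts(Announcer.announce("ready"))            # [info] ready
IO.puts(Announcer.announce("warn", "careful"))  # [warn] careful

# A single definition with a default exports both arities.
IO.inspect(function_exported?(Announcer, :announce, 1))  # true
IO.inspect(function_exported?(Announcer, :announce, 2))  # true</code></pre>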



<p>In practice, you&#8217;ll most commonly use <code>IO.puts/2</code> when printing to the standard error, instead of the standard output:</p>



<pre class="wp-block-code"><code>iex(5)> IO.puts(:stderr, "Hello, Post-Apocalyptic World!")
Hello, Post-Apocalyptic World!
:ok</code></pre>
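
<p>The device argument isn&#8217;t limited to <code>:stdio</code> and <code>:stderr</code>. As a sketch, the standard library&#8217;s <code>StringIO</code> module provides an in-memory device, which also makes the difference between <code>IO.write/2</code> and <code>IO.puts/2</code> easy to inspect:</p>

<pre class="wp-block-code"><code># StringIO (part of the standard library) provides an in-memory IO device.
{:ok, device} = StringIO.open("")

IO.write(device, "Hello, ")  # no newline appended
IO.puts(device, "World!")    # newline appended

# contents/1 returns {remaining_input, captured_output}.
{_input, output} = StringIO.contents(device)
IO.inspect(output)           # "Hello, World!\n"</code></pre>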



<p>For such messages, there are also <code>IO.warn/1</code> (used below) and <code>IO.warn/2</code> which output to the standard error and in addition, include a stacktrace (provided by the developer in the case of <code>IO.warn/2</code>).</p>



<pre class="wp-block-code"><code>iex(6)> IO.warn("Hello, Post-Apocalyptic World!")
warning: Hello, Post-Apocalyptic World!
  (stdlib) erl_eval.erl:680: :erl_eval.do_apply/6
  (elixir) src/elixir.erl:275: :elixir.eval_forms/4
  (iex) lib/iex/evaluator.ex:257: IEx.Evaluator.handle_eval/5
  (iex) lib/iex/evaluator.ex:237: IEx.Evaluator.do_eval/3
  (iex) lib/iex/evaluator.ex:215: IEx.Evaluator.eval/3

:ok</code></pre>



<p>In most cases, parentheses are optional in Elixir, so we could omit them.<sup>2</sup></p>



<pre class="wp-block-code"><code>IO.puts "Hello, World!"</code></pre>



<p>It&#8217;s worth noting that strings are double-quoted literals in Elixir. Unlike in some other languages, single-quoted literals are not an alternative way of representing strings.</p>



<p>In Elixir, single-quoted literals represent a related but ultimately different datatype (i.e., List), as we can verify by using the handy data type information helper <code>i</code> in IEx:</p>



<pre class="wp-block-code"><code>iex(7)> i "Hello, World!"
Term
  "Hello, World!"
Data type
  BitString
Byte size
  13
Description
  This is a string: a UTF-8 encoded binary. It's printed surrounded by
  "double quotes" because all UTF-8 encoded code points in it are printable.
Raw representation
  &lt;&lt;72, 101, 108, 108, 111, 44, 32, 87, 111, 114, 108, 100, 33&gt;&gt;
Reference modules
  String, :binary
Implemented protocols
  Collectable, IEx.Info, Inspect, List.Chars, String.Chars

iex(8)> i 'Hello, World!'
Term
  'Hello, World!'
Data type
  List
Description
  This is a list of integers that is printed as a sequence of characters
  delimited by single quotes because all the integers in it represent printable
  ASCII characters. Conventionally, a list of Unicode code points is known as a
  charlist and a list of ASCII characters is a subset of it.
Raw representation
  [72, 101, 108, 108, 111, 44, 32, 87, 111, 114, 108, 100, 33]
Reference modules
  List
Implemented protocols
  Collectable, Enumerable, IEx.Info, Inspect, List.Chars, String.Chars
iex(9)></code></pre>
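
<p>The two data types can be converted into each other. The sketch below uses <code>String.to_charlist/1</code> and <code>List.to_string/1</code> from the standard library, and builds the charlist by conversion rather than with a single-quoted literal:</p>

<pre class="wp-block-code"><code>string = "Hello"
charlist = String.to_charlist(string)

IO.inspect(is_binary(string))                     # true
IO.inspect(is_list(charlist))                     # true
IO.inspect(charlist == [72, 101, 108, 108, 111])  # true
IO.inspect(List.to_string(charlist) == string)    # true</code></pre>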



<p>Strings are encoded in UTF-8, so you can use special characters and even emojis.</p>



<pre><code class="prettyprinted">iex(9)&gt; IO.puts("Fabrizio De André")
Fabrizio De André
:ok

iex(10)&gt; IO.puts("Hello, &#127758;!")
Hello, &#127758;!
:ok</code></pre>



<p>It&#8217;s also possible to print a given character by specifying its Unicode code point, which the <code>utf8</code> modifier encodes as UTF-8.</p>



<pre><code class="prettyprinted">iex(11)&gt; IO.puts("Hello, #{&lt;&lt;127758 :: utf8&gt;&gt;}!")
Hello, &#127758;!
:ok</code></pre>
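
<p>The same character can also be written with a <code>\u{...}</code> escape that takes the code point in hexadecimal. A brief sketch comparing the two notations, with illustrative variable names:</p>

<pre class="wp-block-code"><code># 127758 is the decimal form of the code point U+1F30E.
globe = &lt;&lt;127_758::utf8&gt;&gt;

IO.inspect(globe == "\u{1F30E}")   # true
IO.inspect(byte_size(globe))       # 4
IO.inspect(String.length(globe))   # 1

# Matching with the utf8 modifier recovers the code point.
&lt;&lt;code_point::utf8&gt;&gt; = globe
IO.inspect(code_point == 0x1F30E)  # true</code></pre>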



<p>If you are familiar with Ruby, you&#8217;ll recognize the same string interpolation syntax in which the expression that needs to be evaluated is enclosed within <code>#{}</code>.</p>



<p>Windows users who run into issues with special characters can improve their experience by switching the active console code page to UTF-8: execute <code>chcp 65001</code> in the Command Prompt before launching IEx.</p>



<p>It doesn&#8217;t get any simpler than a Hello World program, but as you can see, if you dig a little deeper, you can find out quite a few things about a given programming language.</p>



<p>In the next Elixir-related post, I&#8217;ll publish a similar discussion for another seemingly trivial problem: the length of a string in Elixir. It will give us an opportunity to discuss strings more deeply. Subscribe, if you don&#8217;t already, to be notified of its publication.</p>



<h2 class="wp-block-heading">Footnotes</h2>



<ol class="wp-block-list"><li>Atoms are constants whose values are their own names. The value of <code>:ok</code> is <code>:ok</code>. If you are familiar with Ruby, atoms are equivalent to symbols.</li><li>A glaring exception to the parentheses being optional are non-qualified/local calls with zero-arity. In those cases, parentheses are required to distinguish simple variables (e.g., <code>user_list</code>) from actual function calls (e.g., <code>list_user()</code>). To avoid ambiguity, it&#8217;s also important to include parentheses with one-arity functions within pipelines (e.g., <code>"Hello, World!" |&gt; IO.puts()</code>). For stylistic recommendations of when to use parentheses and when to omit them, consider reading these two style guides (<a href="https://github.com/christopheradams/elixir_style_guide#parentheses">1</a>, <a href="https://github.com/lexmag/elixir-style-guide#parentheses">2</a>). At any rate, do not obsess over this. <code>mix format</code> or a properly configured editor will typically take care of applying idiomatic and consistent styling for you.</li></ol>
]]></content:encoded>
					
					<wfw:commentRss>https://programmingzen.com/elixir-hello-world/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">2268</post-id>	</item>
	</channel>
</rss>
