Bug 159897: differences for this patch

en_US.ISO8859-1/books/handbook/disks/chapter.sgml (-79 / +77 lines)
Lines 4038-4044:
     <sect2>
       <title>Synopsis</title>

-      <para>High-availability is one of the main requirements in serious
+      <para>High availability is one of the main requirements in serious
 	business applications and highly-available storage is a key
 	component in such environments.  Highly Available STorage, or
 	<acronym>HAST<remark role="acronym">Highly Available
Lines 4109-4115:
 	  drives.</para>
 	</listitem>
 	<listitem>
-	  <para>File system agnostic, thus allowing to use any file
+	  <para>File system agnostic, thus allowing use of any file
 	    system supported by &os;.</para>
 	</listitem>
 	<listitem>
Lines 4152-4158:
 	total.</para>
       </note>

-      <para>Since the <acronym>HAST</acronym> works in
+      <para>Since <acronym>HAST</acronym> works in
 	primary-secondary configuration, it allows only one of the
 	cluster nodes to be active at any given time.  The
 	<literal>primary</literal> node, also called
Lines 4175-4181:
       </itemizedlist>

       <para><acronym>HAST</acronym> operates synchronously on a block
-	level, which makes it transparent for file systems and
+	level, making it transparent to file systems and
 	applications.  <acronym>HAST</acronym> provides regular GEOM
 	providers in <filename class="directory">/dev/hast/</filename>
 	directory for use by other tools or applications, thus there is
Lines 4252-4258:
 	For stripped-down systems, make sure this module is available.
 	Alternatively, it is possible to build
 	<literal>GEOM_GATE</literal> support into the kernel
-	statically, by adding the following line to the custom kernel
+	statically, by adding this line to the custom kernel
 	configuration file:</para>

       <programlisting>options	GEOM_GATE</programlisting>
Lines 4290-4299:
 	  class="directory">/dev/hast/</filename>) will be called
 	<filename><replaceable>test</replaceable></filename>.</para>

-      <para>The configuration of <acronym>HAST</acronym> is being done
+      <para>Configuration of <acronym>HAST</acronym> is done
 	in the <filename>/etc/hast.conf</filename> file.  This file
 	should be the same on both nodes.  The simplest configuration
-	possible is following:</para>
+	possible is:</para>

       <programlisting>resource test {
 	on hasta {
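The resource block above is cut off at the hunk boundary. For context, a complete two-node definition typically looks like the sketch below; the disk device `/dev/ad6` is an assumption, and the `remote` entries use the host names `hasta`/`hastb` that the surrounding text (and the tip about `/etc/hosts` / local DNS) implies are resolvable:

```
resource test {
	on hasta {
		local /dev/ad6
		remote hastb
	}
	on hastb {
		local /dev/ad6
		remote hasta
	}
}
```

Both nodes must carry an identical copy of this file, as the patched text states.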
Lines 4317-4325:
 	  alternatively in the local <acronym>DNS</acronym>.</para>
       </tip>

-      <para>Now that the configuration exists on both nodes, it is
-	possible to create the <acronym>HAST</acronym> pool.  Run the
-	following commands on both nodes to place the initial metadata
+      <para>Now that the configuration exists on both nodes,
+	the <acronym>HAST</acronym> pool can be created.  Run these
+	commands on both nodes to place the initial metadata
 	onto the local disk, and start the &man.hastd.8; daemon:</para>

       <screen>&prompt.root; <userinput>hastctl create test</userinput>
Lines 4334-4384:
 	  available.</para>
       </note>

-      <para>HAST is not responsible for selecting node's role
-	(<literal>primary</literal> or <literal>secondary</literal>).
-	Node's role has to be configured by an administrator or other
-	software like <application>Heartbeat</application> using the
+      <para>A HAST node's role (<literal>primary</literal> or
+        <literal>secondary</literal>) is selected by an administrator
+        or other
+        software like <application>Heartbeat</application> using the
 	&man.hastctl.8; utility.  Move to the primary node
 	(<literal><replaceable>hasta</replaceable></literal>) and
-	issue the following command:</para>
+	issue this command:</para>

       <screen>&prompt.root; <userinput>hastctl role primary test</userinput></screen>

-      <para>Similarly, run the following command on the secondary node
+      <para>Similarly, run this command on the secondary node
 	(<literal><replaceable>hastb</replaceable></literal>):</para>

       <screen>&prompt.root; <userinput>hastctl role secondary test</userinput></screen>

       <caution>
-	<para>It may happen that both of the nodes are not able to
-	  communicate with each other and both are configured as
-	  primary nodes; the consequence of this condition is called
-	  <literal>split-brain</literal>.  In order to troubleshoot
+	<para>When the nodes are unable to
+	  communicate with each other, and both are configured as
+	  primary nodes, the condition is called
+	  <literal>split-brain</literal>.  To troubleshoot
 	  this situation, follow the steps described in <xref
 	  linkend="disks-hast-sb">.</para>
       </caution>

-      <para>It is possible to verify the result with the
+      <para>Verify the result with the
 	&man.hastctl.8; utility on each node:</para>

       <screen>&prompt.root; <userinput>hastctl status test</userinput></screen>

-      <para>The important text is the <literal>status</literal> line
-	from its output and it should say <literal>complete</literal>
+      <para>The important text is the <literal>status</literal> line,
+	which should say <literal>complete</literal>
 	on each of the nodes.  If it says <literal>degraded</literal>,
 	something went wrong.  At this point, the synchronization
 	between the nodes has already started.  The synchronization
-	completes when the <command>hastctl status</command> command
+	completes when <command>hastctl status</command>
 	reports 0 bytes of <literal>dirty</literal> extents.</para>


-      <para>The last step is to create a filesystem on the
+      <para>The next step is to create a filesystem on the
 	<devicename>/dev/hast/<replaceable>test</replaceable></devicename>
-	GEOM provider and mount it.  This has to be done on the
-	<literal>primary</literal> node (as the
+	GEOM provider and mount it.  This must be done on the
+	<literal>primary</literal> node, as
 	<filename>/dev/hast/<replaceable>test</replaceable></filename>
-	appears only on the <literal>primary</literal> node), and
-	it can take a few minutes depending on the size of the hard
+	appears only on the <literal>primary</literal> node.
+	It can take a few minutes depending on the size of the hard
 	drive:</para>

       <screen>&prompt.root; <userinput>newfs -U /dev/hast/test</userinput>
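The hunk above says synchronization is done when the status output reports 0 dirty bytes. Checking that mechanically is a small text-processing job; the sketch below uses a canned, hypothetical status output (the exact `hastctl status` format is an assumption here, not taken from the patch) purely to illustrate extracting the dirty count:

```shell
#!/bin/sh
# Canned sample of what a `hastctl status test` report might look like
# (hypothetical format; substitute real command output in practice).
status_output='resource: test
  role: primary
  status: complete
  dirty: 0 (0B)'

# Pull the byte count out of the "dirty:" line.
dirty=$(printf '%s\n' "$status_output" | awk '/dirty:/ { print $2 }')

if [ "$dirty" -eq 0 ]; then
    echo "synchronization complete"
else
    echo "still syncing: $dirty bytes dirty"
fi
```

Wrapping this in a sleep loop over the real command would give a crude "wait for sync" helper.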
Lines 4387-4395:

       <para>Once the <acronym>HAST</acronym> framework is configured
 	properly, the final step is to make sure that
-	<acronym>HAST</acronym> is started during the system boot time
-	automatically.  The following line should be added to the
-	<filename>/etc/rc.conf</filename> file:</para>
+	<acronym>HAST</acronym> is started automatically during the system
+	boot.  This line is added to
+	<filename>/etc/rc.conf</filename>:</para>

       <programlisting>hastd_enable="YES"</programlisting>

Lines 4397-4422:
 	<title>Failover Configuration</title>

 	<para>The goal of this example is to build a robust storage
-	  system which is resistant from the failures of any given node.
-	  The key task here is to remedy a scenario when a
-	  <literal>primary</literal> node of the cluster fails.  Should
-	  it happen, the <literal>secondary</literal> node is there to
+	  system which is resistant to failures of any given node.
+	  The scenario is that a
+	  <literal>primary</literal> node of the cluster fails.  If
+	  this happens, the <literal>secondary</literal> node is there to
 	  take over seamlessly, check and mount the file system, and
 	  continue to work without missing a single bit of data.</para>

-	<para>In order to accomplish this task, it will be required to
-	  utilize another feature available under &os; which provides
+	<para>To accomplish this task, another &os; feature provides
 	  for automatic failover on the IP layer &mdash;
-	  <acronym>CARP</acronym>.  <acronym>CARP</acronym> stands for
-	  Common Address Redundancy Protocol and allows multiple hosts
+	  <acronym>CARP</acronym>.  <acronym>CARP</acronym> (Common Address
+	  Redundancy Protocol) allows multiple hosts
 	  on the same network segment to share an IP address.  Set up
 	  <acronym>CARP</acronym> on both nodes of the cluster according
 	  to the documentation available in <xref linkend="carp">.
-	  After completing this task, each node should have its own
+	  After setup, each node will have its own
 	  <devicename>carp0</devicename> interface with a shared IP
 	  address <replaceable>172.16.0.254</replaceable>.
-	  Obviously, the primary <acronym>HAST</acronym> node of the
-	  cluster has to be the master <acronym>CARP</acronym>
+	  The primary <acronym>HAST</acronym> node of the
+	  cluster must be the master <acronym>CARP</acronym>
 	  node.</para>

 	<para>The <acronym>HAST</acronym> pool created in the previous
Lines 4430-4446:

 	<para>In the event of <acronym>CARP</acronym> interfaces going
 	  up or down, the &os; operating system generates a &man.devd.8;
-	  event, which makes it possible to watch for the state changes
+	  event, making it possible to watch for the state changes
 	  on the <acronym>CARP</acronym> interfaces.  A state change on
 	  the <acronym>CARP</acronym> interface is an indication that
-	  one of the nodes failed or came back online.  In such a case,
-	  it is possible to run a particular script which will
+	  one of the nodes failed or came back online.  These state change
+	  events make it possible to run a script which will
 	  automatically handle the failover.</para>

-	<para>To be able to catch the state changes on the
-	  <acronym>CARP</acronym> interfaces, the following
-	  configuration has to be added to the
-	  <filename>/etc/devd.conf</filename> file on each node:</para>
+	<para>To be able to catch state changes on the
+	  <acronym>CARP</acronym> interfaces, add this
+	  configuration to
+	  <filename>/etc/devd.conf</filename> on each node:</para>

 	<programlisting>notify 30 {
 	match "system" "IFNET";
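The devd.conf listing is split across two hunks of this diff: the opening `notify 30 { match "system" "IFNET";` appears above, and a closing `action ... slave"; };` appears in the next hunk. For context, a complete pair of notify blocks would read approximately as follows; the `subsystem`/`type` match strings are assumptions inferred from the `carp0` interface and the master/slave actions shown, not quoted from the patch:

```
notify 30 {
	match "system" "IFNET";
	match "subsystem" "carp0";
	match "type" "LINK_UP";
	action "/usr/local/sbin/carp-hast-switch master";
};

notify 30 {
	match "system" "IFNET";
	match "subsystem" "carp0";
	match "type" "LINK_DOWN";
	action "/usr/local/sbin/carp-hast-switch slave";
};
```

One block reacts to the interface coming up (node becomes CARP master), the other to it going down (node becomes slave).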
Lines 4456-4467:
 	action "/usr/local/sbin/carp-hast-switch slave";
 };</programlisting>

-	<para>To put the new configuration into effect, run the
-	  following command on both nodes:</para>
+	<para>Restart &man.devd.8; on both nodes to put the new configuration
+	  into effect:</para>

 	<screen>&prompt.root; <userinput>/etc/rc.d/devd restart</userinput></screen>

-	<para>In the event that the <devicename>carp0</devicename>
+	<para>When the <devicename>carp0</devicename>
 	  interface goes up or down (i.e. the interface state changes),
 	  the system generates a notification, allowing the &man.devd.8;
 	  subsystem to run an arbitrary script, in this case
Lines 4471-4477:
 	  &man.devd.8; configuration, please consult the
 	  &man.devd.conf.5; manual page.</para>

-	<para>An example of such a script could be following:</para>
+	<para>An example of such a script could be:</para>

 <programlisting>#!/bin/sh

Lines 4557-4569:
 	;;
 esac</programlisting>

-	<para>In a nutshell, the script does the following when a node
+	<para>In a nutshell, the script takes these actions when a node
 	  becomes <literal>master</literal> /
 	  <literal>primary</literal>:</para>

 	<itemizedlist>
 	  <listitem>
-	    <para>Promotes the <acronym>HAST</acronym> pools as
+	    <para>Promotes the <acronym>HAST</acronym> pools to
 	      primary on a given node.</para>
 	  </listitem>
 	  <listitem>
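The body of the carp-hast-switch script is elided in this diff (only its first and last lines appear). As a reading aid, its master/slave branches can be sketched as a dry run that echoes the commands instead of executing them; the resource name `test` and mount point `/hast/test` are assumptions, and the real script also needs locking, retries, and error handling:

```shell
#!/bin/sh
# Dry-run sketch of a carp-hast-switch-style failover handler.
# It echoes the commands it would run so the control flow can be
# inspected without hastctl, fsck, or mount being present.
resource="test"          # assumed HAST resource name
mountpoint="/hast/test"  # assumed mount point

carp_hast_switch() {
	case "$1" in
	master)
		# Become primary: promote the pool, check and mount the file system.
		echo "hastctl role primary $resource"
		echo "fsck -p -y -t ufs /dev/hast/$resource"
		echo "mount /dev/hast/$resource $mountpoint"
		;;
	slave)
		# Become secondary: release the file system and demote the pool.
		echo "umount -f $mountpoint"
		echo "hastctl role secondary $resource"
		;;
	*)
		echo "usage: $0 master|slave" >&2
		return 1
		;;
	esac
}

carp_hast_switch master
```

The ordering matters: the pool must be primary before `/dev/hast/test` exists to be checked and mounted, and conversely the file system must be unmounted before the pool can be demoted.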
Lines 4571-4577:
 	      <acronym>HAST</acronym> pool.</para>
 	  </listitem>
 	  <listitem>
-	    <para>Mounts the pools at appropriate place.</para>
+	    <para>Mounts the pools at an appropriate place.</para>
 	  </listitem>
 	</itemizedlist>

Lines 4590-4604:

 	<caution>
 	  <para>Keep in mind that this is just an example script which
-	    should serve as a proof of concept solution.  It does not
+	    should serve as a proof of concept.  It does not
 	    handle all the possible scenarios and can be extended or
 	    altered in any way, for example it can start/stop required
-	    services etc.</para>
+	    services, etc.</para>
 	</caution>

 	<tip>
-	  <para>For the purpose of this example we used a standard UFS
-	    file system.  In order to reduce the time needed for
+	  <para>For this example, we used a standard UFS
+	    file system.  To reduce the time needed for
 	    recovery, a journal-enabled UFS or ZFS file system can
 	    be used.</para>
 	</tip>
Lines 4615-4655:
       <sect3>
 	<title>General Troubleshooting Tips</title>

-	<para><acronym>HAST</acronym> should be generally working
-	  without any issues, however as with any other software
+	<para><acronym>HAST</acronym> should generally work
+	  without issues.  However, as with any other software
 	  product, there may be times when it does not work as
 	  supposed.  The sources of the problems may be different, but
 	  the rule of thumb is to ensure that the time is synchronized
 	  between all nodes of the cluster.</para>

-	<para>The debugging level of the &man.hastd.8; should be
-	  increased when troubleshooting <acronym>HAST</acronym>
-	  problems.  This can be accomplished by starting the
+	<para>When troubleshooting <acronym>HAST</acronym> problems,
+	  the debugging level of &man.hastd.8; should be increased
+	  by starting the
 	  &man.hastd.8; daemon with the <literal>-d</literal>
-	  argument.  Note, that this argument may be specified
+	  argument.  Note that this argument may be specified
 	  multiple times to further increase the debugging level.  A
-	  lot of useful information may be obtained this way.  It
-	  should be also considered to use <literal>-F</literal>
-	  argument, which will start the &man.hastd.8; daemon in
+	  lot of useful information may be obtained this way.  Consider
+	  also using the <literal>-F</literal>
+	  argument, which starts the &man.hastd.8; daemon in the
 	  foreground.</para>
      </sect3>

       <sect3 id="disks-hast-sb">
 	<title>Recovering from the Split-brain Condition</title>

-	<para>The consequence of a situation when both nodes of the
-	  cluster are not able to communicate with each other and both
-	  are configured as primary nodes is called
-	  <literal>split-brain</literal>.  This is a dangerous
+	<para><literal>Split-brain</literal> is when the nodes of the
+	  cluster are unable to communicate with each other, and both
+	  are configured as primary.  This is a dangerous
 	  condition because it allows both nodes to make incompatible
-	  changes to the data.  This situation has to be handled by
-	  the system administrator manually.</para>
+	  changes to the data.  This problem must be corrected
+	  manually by the system administrator.</para>

-	<para>In order to fix this situation the administrator has to
+	<para>The administrator must
 	  decide which node has more important changes (or merge them
-	  manually) and let the <acronym>HAST</acronym> perform
+	  manually) and let <acronym>HAST</acronym> perform
 	  the full synchronization of the node which has the broken
-	  data.  To do this, issue the following commands on the node
+	  data.  To do this, issue these commands on the node
 	  which needs to be resynchronized:</para>

         <screen>&prompt.root; <userinput>hastctl role init &lt;resource&gt;</userinput>
