ekuric 22:38 on January 11, 2016
Tags: ceph ( 7 ), ceph cluster, ceph OSD

add/remove CEPH OSD – Object Storage Device

The blog post Install CEPH cluster – OS Fedora 23 describes how to set up a CEPH storage cluster based on Fedora 23. In that configuration I used only one OSD per CEPH node; in real life you will want to have more OSDs per CEPH node.

OSD stands for Object Storage Device and is one of the main components of a CEPH storage cluster. Recommended reading: CEPH OSD

Adding a new OSD is not a difficult task, and it can be done via ceph-deploy or by running ceph-disk commands.
In the ceph cluster I created in the previous post I used KVM, so as the OSD here I am going to use a virtual disk attached to a KVM machine.

Let’s create the disk we are going to use for the OSD

 
# qemu-img create -f qcow2 cephf23-node1disk1.qcow2 15G

Attach the disk to the KVM domain: either edit the KVM machine's .xml domain file and add a definition for the new disk there, or run

# virsh edit  kvm_domain

and add the same disk definition in the editor.

Restart kvm guest

# virsh destroy kvm_domain
# virsh start kvm_domain

Once the machine boots up, the newly added disk will be visible. In this configuration it shows up as /dev/vdc; the name can be different in another configuration.

ADD OSD

First we are going to add OSD using ceph-deploy

ceph-deploy disk zap cephf23-node1:vdc
ceph-deploy osd prepare cephf23-node1:vdc
ceph-deploy osd activate cephf23-node1:/dev/vdc1:/dev/vdc2

The above process of adding a new OSD uses ceph-deploy, which by default creates an XFS filesystem on the OSD. If you do not want to use XFS, the approach below lets you specify a different filesystem. At the time of writing xfs and ext4 are supported, while btrfs is experimental and not yet widely used in production.

ceph-disk prepare --cluster {cluster-name} --cluster-uuid {cluster-uuid} --fs-type xfs|ext4|btrfs {device}

In this specific case command will be

# parted -s /dev/vdc mklabel gpt 
# ceph-disk prepare --cluster ceph --cluster-uuid b71a3eb1-e253-410a-bf11-84ae01bad654 --fs-type xfs /dev/vdc 
# ceph-disk activate /dev/vdc1

cluster-uuid ( b71a3eb1-e253-410a-bf11-84ae01bad654 ) is the fsid value from ceph.conf
cluster name: the default name is ceph unless specified otherwise when ceph-deploy was run, eg ceph-deploy --cluster=cluster_name
check ceph-deploy docs

After this, the new OSD will be added to the ceph cluster. The cluster might not be in HEALTH_OK for a short while, as it can take some time before PGs are rebalanced across the new OSD. This is a normal process and needs attention only if the cluster remains in an unhealthy state.
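
While PGs are rebalancing, the health check can simply be repeated until the cluster settles. Below is a minimal sketch of such a wait loop. The function name wait_for_health_ok, the number of attempts and the delay are made up for illustration; the only thing it assumes about ceph is that ceph health prints a line starting with HEALTH_OK once the cluster is fine.

```shell
# Poll "ceph health" until it reports HEALTH_OK or the attempts run out.
# Arguments (hypothetical defaults): number of attempts, seconds between attempts.
wait_for_health_ok() (
    attempts=${1:-30}
    delay=${2:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        status=$(ceph health 2>/dev/null)
        case "$status" in
            HEALTH_OK*) echo "cluster is healthy"; exit 0 ;;
        esac
        i=$((i + 1))
        echo "attempt $i/$attempts: ${status:-no answer from ceph}"
        sleep "$delay"
    done
    exit 1
)
```

Running wait_for_health_ok 60 5 on a node with admin access would check every 5 seconds for up to 5 minutes and return non-zero if the cluster never reached HEALTH_OK.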

REMOVE OSD

In order to remove an OSD, we first need to identify the OSD we want to remove; ceph osd tree can help. The output below shows all OSDs, and osd.3 will be removed in the steps afterwards

 
# ceph osd tree
ID WEIGHT  TYPE NAME              UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 0.04997 root default                                             
-2 0.00999     host cephf23-node3                                   
 0 0.00999         osd.0               up  1.00000          1.00000 
-3 0.01999     host cephf23-node2                                   
 1 0.00999         osd.1               up  1.00000          1.00000 
 4 0.00999         osd.4               up  1.00000          1.00000 
-4 0.01999     host cephf23-node1                                   
 2 0.00999         osd.2               up  1.00000          1.00000 
 3 0.00999         osd.3               up  1.00000          1.00000

If osd.3 is picked to be removed, then below steps will help with that

 
# /etc/init.d/ceph stop osd.3
=== osd.3 === 
Stopping Ceph osd.3 on cephf23-node1...done
# ceph osd out osd.3
marked out osd.3. 
# ceph osd down osd.3
marked down osd.3. 
# ceph osd rm osd.3
removed osd.3
# ceph osd crush remove osd.3
removed item id 3 name 'osd.3' from crush map
# ceph auth del osd.3
updated
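
Since these steps are destructive, it can help to generate the command list for a given OSD id and read it before running anything. The sketch below only prints the same six commands, in the same order as used above; osd_removal_commands is a hypothetical helper name, not part of ceph.

```shell
# Print, in order, the commands used above to remove one OSD.
# Nothing is executed here: review the output first, then run the
# commands by hand on the node that hosts the OSD.
osd_removal_commands() {
    id=$1
    case "$id" in
        ''|*[!0-9]*)
            echo "usage: osd_removal_commands <numeric osd id>" >&2
            return 1 ;;
    esac
    cat <<EOF
/etc/init.d/ceph stop osd.$id
ceph osd out osd.$id
ceph osd down osd.$id
ceph osd rm osd.$id
ceph osd crush remove osd.$id
ceph auth del osd.$id
EOF
}

osd_removal_commands 3
```

Printing instead of executing is a deliberate choice here: a wrong id removes a healthy OSD, so a human check between generating and running the commands is cheap insurance.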

Now, ceph osd tree will not show osd.3 anymore

# ceph osd tree
ID WEIGHT  TYPE NAME              UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 0.03998 root default                                             
-2 0.00999     host cephf23-node3                                   
 0 0.00999         osd.0               up  1.00000          1.00000 
-3 0.01999     host cephf23-node2                                   
 1 0.00999         osd.1               up  1.00000          1.00000 
 4 0.00999         osd.4               up  1.00000          1.00000 
-4 0.00999     host cephf23-node1                                   
 2 0.00999         osd.2               up  1.00000          1.00000

#ceph, #ceph-cluster, #ceph-osd

ekuric 17:51 on January 1, 2016
Tags: ceph ( 7 ), ceph pg, ceph pgp, ceph pool, fedora ( 7 ), Linux ( 20 ), object storage device, osd, storage ( 7 )

CEPH storage cluster installation – OS Fedora 23

In this blog post I am going to document the steps I took to install a CEPH storage cluster. A CEPH storage cluster installed this way can be used later on in an openstack/openshift installation as storage for virtual machines/pods, or deployed with some other solution requesting object and/or block storage. CEPH FS ( File System ) exists, but it will not be discussed in this blog post.

CEPH storage is an open source storage solution which is becoming very popular due to its flexibility and the feature set it offers. The Ceph project was started by Sage Weil back in 2007 or so, more at : ceph wiki page. The current version of CEPH is Hammer (v0.94) and this version of ceph will be used in this blog post.

As operating system for the CEPH cluster I am going to use Fedora 23, for the reasons below

it has a good set of features and many available packages. I guess the same process we describe here can be used with Debian, with small corrections to package/file names
it is close to Red Hat Enterprise Linux ( I know I could use CentOS 7, but I have Fedora 23 machines handy ), and the information you get here can be easily transferred to RHEL ( after reading the Red Hat ceph documentation )
Note: In order to apply notes from here to RHEL, you will need to work with Red Hat Sales / Support to get access to the proper Red Hat software entitlements which contain CEPH packages
it is free, and to start working with CEPH you do not need subscriptions to get the software

I am going to use Fedora 23 KVM environment for this POC, due to below

it is most convenient and cheap – I do not need to invest in hardware
I have access to it

Using KVM as base for CEPH nodes is not supported in production, so be aware of this in case you decide to use RHEL instead of Fedora and if you want to get support for CEPH cluster running on top of RHEL from Red Hat global support services team. Here are in my opinion some interesting links regarding CEPH cluster sizing and planning

The first step is to install Fedora 23. You can use this centos kickstart for this purpose, just adapt it to point to the proper Fedora repositories, or you can install the machines manually. ISO installation images are available from Fedora 23 server iso images

Once the system is installed, I recommend updating it to the latest packages

# dnf -y update

For a CEPH storage cluster we need at least 3 CEPH monitor ( mon service ) machines to preserve HA. From the excellent CEPH book Learning CEPH ( I got it for free, but buying it, which I strongly advise, is worth the money ) we can read: A Ceph storage cluster requires at least one monitor to run. For high availability, a Ceph storage cluster relies on an odd number of monitors that’s more than one, for example, 3 or 5, to form a quorum. For this initial POC I find 3 monitors to be fine, as it will provide an HA solution and serve the POC purpose. Later I am going to add more monitors.

In this test environment, instead of installing a machine with the same configuration three times, I installed it once and then cloned it with KVM tools to the desired number, so I had identical KVM guests / machines for CEPH nodes.

After machine installation, we need to ensure below on ceph nodes before doing any further steps

ensure all ceph nodes can properly resolve all other nodes, either by configuring a DNS server or via /etc/hosts. In my case I have a DNS server already in place, and I added my ceph cluster nodes to the DNS configuration and it worked fine
Important : if ceph nodes are not able to properly resolve other nodes, there will be problems
ensure that ceph nodes have access to the internet
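
Name resolution is easy to verify before going further. The sketch below loops over node names and reports the ones that do not resolve; it relies on getent hosts, which consults both DNS and /etc/hosts on glibc based systems. check_resolution is a hypothetical helper name, and the node names are the ones used in this post.

```shell
# Report whether each given host name resolves (DNS or /etc/hosts).
# Returns non-zero if at least one name fails.
check_resolution() (
    failed=0
    for node in "$@"; do
        if addr=$(getent hosts "$node"); then
            # first field of the getent output is the address
            echo "ok: $node -> ${addr%% *}"
        else
            echo "FAILED: $node"
            failed=1
        fi
    done
    exit "$failed"
)

# Example, to be run on every ceph node:
#   check_resolution cephf23-node1 cephf23-node2 cephf23-node3
```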

With physical hardware, it is expected / necessary to have a separate disk for CEPH OSDs. In my test case, as I am using KVM guests, I created a virtual disk for every machine using the commands below. I decided to use a 15 GB disk size, as this is just a test POC which can later be turned into a real POC

# qemu-img create -f qcow2 cephf23-node1disk.qcow2 15G
# qemu-img create -f qcow2 cephf23-node2disk.qcow2 15G
# qemu-img create -f qcow2 cephf23-node3disk.qcow2 15G

After this, in order for the kvm guests to see / use these disks, it is necessary to attach them to the machines. I edited the /etc/libvirt/qemu/kvm_guest_file.xml files for the kvm guests and added a definition for the new block device there. There is already a definition for a disk in the kvm machine .xml file, so it is easy to add a new disk: just follow the same syntax, while pointing to the desired disk and adapting PCI numbers. If there is a mistake in the configuration, it will be reported during the virsh define step below

After this, it is necessary to (re)define the machine and restart it

# virsh destroy kvm_machine
# virsh define  /etc/libvirt/qemu/kvm_guest_file.xml  
# virsh start kvm_machine

where kvm_machine is name of your KVM domain/machine. Once kvm guest is up, new disk marked as /dev/vdb will be visible. It is necessary to repeat above process for all guests.

Another option for adding storage to virtual guests is described in the documentation Adding storage devices to guests, using for example

 # virsh attach-disk Guest1 /var/lib/libvirt/images/FileName.img vdb --cache none

which is supposed to work too. The part related to disks is kvm specific; with physical hardware it is not necessary.

Further, it is necessary to have passwordless login between CEPH nodes; ensure this is working ( ssh-keygen, ssh-copy-id … )

Let’s now proceed and install packages

# dnf install -y ceph-deploy

More about ceph-deploy can be found in the ceph-deploy ceph documentation. In short, ceph-deploy is a tool which makes installing a ceph cluster easier than doing it by hand.

Assuming ceph cluster nodes can properly resolve other nodes, and passwordless access works fine, issuing the commands below

# mkdir /etc/ceph
# ceph-deploy new cephf23-node1 cephf23-node2 cephf23-node3

will write ceph.conf file with some basic parameters. In this case

[global]
fsid = b71a3eb1-e253-410a-bf11-84ae01bad654
mon_initial_members = cephf23-node1, cephf23-node2, cephf23-node3 
mon_host = 192.168.122.101,192.168.122.102,192.168.122.103 
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true

I added in ceph.conf additionally

cluster_network = 192.168.122.0/24

to define the cluster network, the network for cluster traffic. It will be the same as the public network, as in this case I have only one network card. In ceph.conf we can see mon_initial_members and mon_host, which MUST be correct from a hostname / IP point of view. If that is not the case, it is necessary to go back and check DNS settings.

After running ceph-deploy new as shown above, ceph.conf will be created. It is necessary to pay particular attention to mon_initial_members and mon_host: the correct hostnames / IPs for these machines must be written in ceph.conf

Now we are ready to install ceph packages

# ceph-deploy install cephf23-node1 cephf23-node2 cephf23-node3  --no-adjust-repos

The last command will install all ceph components on the specified nodes. If --no-adjust-repos is not specified, ceph-deploy will go to the ceph repositories and try to get packages from there, which is not bad if the packages exist. At the time of writing, the ceph repositories offer packages for Fedora 20 and ceph-deploy will fail.
Specifying --no-adjust-repos instructs ceph-deploy not to adjust repositories and to get packages from the OS repos, in this case Fedora 23.

The install step will take some time, depending on the network speed available to download ceph packages. Once it finishes, it is necessary to create monitors

# ceph-deploy --overwrite-conf mon create-initial

It is important to specify --overwrite-conf, as ceph.conf was edited, and --overwrite-conf will ensure ceph-deploy does not fail.

Now ceph mon dump shows monitors status

# ceph mon dump
dumped monmap epoch 1
epoch 1
fsid b71a3eb1-e253-410a-bf11-84ae01bad654
last_changed 0.000000
created 0.000000
0: 192.168.122.101:6789/0 mon.cephf23-node1
1: 192.168.122.102:6789/0 mon.cephf23-node2
2: 192.168.122.103:6789/0 mon.cephf23-node3

and after this ceph status shows

# ceph status
    cluster b71a3eb1-e253-410a-bf11-84ae01bad654
     health HEALTH_ERR
            64 pgs stuck inactive
            64 pgs stuck unclean
            no osds
     monmap e1: 3 mons at {cephf23-node1=192.168.122.101:6789/0,cephf23-node2=192.168.122.102:6789/0,cephf23-node3=192.168.122.103:6789/0}
            election epoch 30, quorum 0,1,2 cephf23-node1,cephf23-node2,cephf23-node3
     osdmap e1: 0 osds: 0 up, 0 in
      pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating

Ok, ceph has some health issues. If ceph health is checked, we see

# ceph health
HEALTH_ERR 64 pgs stuck inactive; 64 pgs stuck unclean; no osds

The most important part of the ceph health output is no osds. An OSD ( Object Storage Device ) is the building block of a CEPH cluster, and it is necessary to create them.

OSD creation is simple, however there are important points to take care of prior to OSD creation

identify the block device for the OSD
ensure it really is the device planned for the OSD, as OSD preparations / actions are destructive and they can cause trouble if pointed to the wrong block device

Once block devices are identified, run ceph-deploy disk zap :

 # ceph-deploy disk zap cephf23-node1:vdb
 # ceph-deploy disk zap cephf23-node2:vdb
 # ceph-deploy disk zap cephf23-node3:vdb

ceph-deploy disk zap wipes the partition table and content of the disk, so that it can be prepared as an OSD.

After this step, prepare the OSDs

# ceph-deploy osd prepare cephf23-node1:vdb
# ceph-deploy osd prepare cephf23-node2:vdb
# ceph-deploy osd prepare cephf23-node3:vdb

I gave the whole of /dev/vdb to ceph-deploy, leaving it up to ceph-deploy to divide it into a data and a journal part, eg if we run

# blkid |  grep vdb
/dev/vdb1: UUID="cdf2f55e-67ee-4077-808e-fcfa94f531ae" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="6cb675e2-ed7c-4757-861e-8090b4c3dda3"
/dev/vdb2: PARTLABEL="ceph journal" PARTUUID="b75b2e61-5664-4730-af52-a43a1ee99845"

from this it is visible that /dev/vdb1 is marked as ceph data and /dev/vdb2 as ceph journal. For better performance it is recommended to use a separate device for the ceph journal ( even an SSD disk ).

In above example, if there was separate device for journal, then ceph-deploy osd prepare step would be

# ceph-deploy osd prepare {node}:/dev/vdb:/dev/vdX

where vdX would be separate device for journal

Knowing that on /dev/vdb we have now

ceph data
ceph journal

command to activate OSD would be

# ceph-deploy osd activate cephf23-node1:/dev/vdb1:/dev/vdb2
# ceph-deploy osd activate cephf23-node2:/dev/vdb1:/dev/vdb2
# ceph-deploy osd activate cephf23-node3:/dev/vdb1:/dev/vdb2

If we now execute ceph osd tree it will show OSDs and their placement across CEPH nodes.

# ceph osd tree
ID WEIGHT  TYPE NAME              UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 0.02998 root default                                             
-2 0.00999     host cephf23-node3                                   
 0 0.00999         osd.0               up  1.00000          1.00000 
-3 0.00999     host cephf23-node2                                   
 1 0.00999         osd.1               up  1.00000          1.00000 
-4 0.00999     host cephf23-node1                                   
 2 0.00999         osd.2               up  1.00000          1.00000

ceph osd dump will dump osd parameters

# ceph osd dump
epoch 54
fsid b71a3eb1-e253-410a-bf11-84ae01bad654
created 2015-12-27 18:07:12.158247
modified 2015-12-31 16:05:27.526052
flags 
pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
max_osd 3
osd.0 up   in  weight 1 up_from 51 up_thru 51 down_at 50 last_clean_interval [34,43) 192.168.122.103:6800/1081 192.168.122.103:6801/1081 192.168.122.103:6802/1081 192.168.122.103:6803/1081 exists,up bf703535-de09-4bde-b81e-805a9c85169b
osd.1 up   in  weight 1 up_from 48 up_thru 51 down_at 44 last_clean_interval [35,43) 192.168.122.102:6800/1090 192.168.122.102:6801/1090 192.168.122.102:6802/1090 192.168.122.102:6803/1090 exists,up 60d237f0-38c0-4dfd-9672-ef4076c57a7f
osd.2 up   in  weight 1 up_from 46 up_thru 51 down_at 45 last_clean_interval [42,43) 192.168.122.101:6800/1228 192.168.122.101:6801/1228 192.168.122.101:6802/1228 192.168.122.101:6803/1228 exists,up 6cb675e2-ed7c-4757-861e-8090b4c3dda3

In ps aux | grep ceph-osd we can see there is an osd process running. This is for node1, but the same applies to the other nodes. There is one OSD process per block device, which means that if we had more OSDs on a CEPH node, we would see more osd.X.pid processes in the ps aux output

# ps aux | grep ceph-osd
root      1226  0.0  0.1 119632  3180 ?        Ss   10:23   0:00 /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/ceph/ceph.conf --cluster ceph -f
root      1228  0.2  2.2 713224 46884 ?        Sl   10:23   0:23 /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/ceph/ceph.conf --cluster ceph -f

If ceph status is run now, it will show the status of the ceph cluster

# ceph status 
    cluster b71a3eb1-e253-410a-bf11-84ae01bad654
     health HEALTH_OK
     monmap e1: 3 mons at {cephf23-node1=192.168.122.101:6789/0,cephf23-node2=192.168.122.102:6789/0,cephf23-node3=192.168.122.103:6789/0}
            election epoch 32, quorum 0,1,2 cephf23-node1,cephf23-node2,cephf23-node3
     osdmap e13: 3 osds: 3 up, 3 in
      pgmap v54: 64 pgs, 1 pools, 0 bytes data, 0 objects
            100 MB used, 30586 MB / 30686 MB avail
                  64 active+clean

Now we see the status of the ceph storage cluster, and one might ask why there is only 30 GB of space available when three 15 GB OSD devices are built into the ceph cluster. Now we have

# fdisk -l /dev/vdb
Disk /dev/vdb: 15 GiB, 16106127360 bytes, 31457280 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 893E7CCC-C92A-47DD-9DAE-FC62DFA91314

Device        Start      End  Sectors Size Type
/dev/vdb1  10487808 31457246 20969439  10G Ceph OSD
/dev/vdb2      2048 10485760 10483713   5G Ceph Journal

Partition table entries are not in disk order.

The reason for this is osd journal size, which is left at its default of 5120 MB, so if nothing else is specified, 5 GB of each disk is used for the ceph journal. That leaves 10 GB of data space per OSD, and 30 GB for three OSDs. More about OSD config parameters: osd config reference. This is not an issue; if we specify a separate device for the journal, then the whole device will be used.
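
The numbers can be checked with a bit of shell arithmetic. This is only a back-of-the-envelope sketch using the values from this post ( 15 GB disks, 5120 MB journal, three OSDs ); osd_data_mb is a made-up helper name, and the real figure reported by ceph status ( 30686 MB ) is slightly lower because of partition alignment and filesystem overhead.

```shell
# Data space left on one OSD disk when the journal lives on the same disk.
# Arguments: disk size in MB, journal size in MB.
osd_data_mb() {
    echo $(( $1 - $2 ))
}

disk_mb=$((15 * 1024))    # 15 GB virtual disk per node
journal_mb=5120           # journal size mentioned above
per_osd=$(osd_data_mb "$disk_mb" "$journal_mb")
echo "per OSD: ${per_osd} MB, three OSDs: $((per_osd * 3)) MB"
```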

From this point it is possible to create ceph pool(s) and use them.

In order to create CEPH pool we can do below

 
# ceph osd pool create poolA 128 128

this will create poolA with the desired number of placement groups ( pg ), which is visible in the output below

# ceph osd lspools
0 rbd,3 poolA

[root@cephf23-node1 ~]# rados lspools
rbd
poolA

General command for creating CEPH pool is

# ceph osd pool create {pool-name} {pg_number} {pgp_number}

where pg_number ( number of placement groups ) and pgp_number ( number of placement groups for placement ) have to be equal, otherwise CEPH will not start re-balancing

How many PGs do we need in a specific CEPH cluster? This number depends on the number of OSDs and the number of replicas, but the formula below can help

           (OSDs * 100)
Total PGs = ------------
              Replicas

which is stated on CEPH web How Data Is Stored In CEPH Cluster
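
As a worked example, here is the formula as a small shell function. One addition that is not in the formula itself: the result is rounded up to the next power of two, which is the usual guidance in the Ceph placement group documentation. total_pgs is a made-up name for this sketch.

```shell
# Suggested placement group count: (OSDs * 100) / replicas,
# rounded up to the next power of two.
# Arguments: number of OSDs, number of replicas.
total_pgs() (
    osds=$1
    replicas=$2
    raw=$(( (osds * 100 + replicas - 1) / replicas ))   # ceiling division
    pgs=1
    while [ "$pgs" -lt "$raw" ]; do
        pgs=$((pgs * 2))
    done
    echo "$pgs"
)

total_pgs 3 3
```

For the cluster in this post ( 3 OSDs, replicated size 3 ) the raw value is 100, which rounds up to 128, the same number used when creating poolA above.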

Also do not forget to enable ceph service to ensure it starts during boot

# systemctl enable ceph
ceph.service is not a native service, redirecting to systemd-sysv-install
Executing /usr/lib/systemd/systemd-sysv-install enable ceph

This is a basic CEPH storage cluster setup. It can be turned into a useful POC if physical hardware ( certified and supported HW ) is used instead of KVM, and if it is adapted to other specific needs. For a production case it would also be necessary to tweak the points below

In the above configuration I decided to run monitors and OSDs on the same machines, which works fine. However, this consumes more memory / cpu on the machine, and for production environments it is recommended to have monitors and OSDs running on separate machines
separate networks : cluster and public networks
configure firewall: I disabled firewalld in this configuration. To open the ports which are necessary for CEPH, all that is needed is to follow the ceph network reference manual and apply the recommendations from there
CRUSH: what it is and how to configure it, read crush and crush paper

What if something is broken? It is possible to do the below

try to debug it, /var/log/messages and /var/log/ceph/* on CEPH nodes are the locations where to start
A fellow Red Hatter gave an excellent presentation on this topic, Troubleshooting CEPH, and I recommend checking it.
If it is not possible to solve, then collect data and open a BZ for CEPH

If you suspect that the issue might be some kind of PEBKAC, then it is possible to start from zero and try to configure it again. ceph-deploy offers simple tools to remove ceph packages and reset ceph nodes to the state before ceph packages installation

For example, the commands below will bring the system to its pre-installation state. They remove all ceph packages, and it is then possible to try the installation again using the steps from the beginning of this post

 
# ceph-deploy uninstall cephf23-node1 cephf23-node2 cephf23-node3
# ceph-deploy purgedata cephf23-node1 cephf23-node2 cephf23-node3 
# ceph-deploy purge cephf23-node1 cephf23-node2 cephf23-node3
# ceph-deploy forgetkeys

and all is ready to start again. Doing it again, will be good learning experience anyway.

Happy CEPH hacking!

#ceph, #ceph-pg, #ceph-pgp, #ceph-pool, #fedora-2, #linux, #object-storage-device, #osd, #storage

ekuric 11:27 on December 30, 2015
Tags: Linux ( 20 ), sfdisk, storage ( 7 )

copy/edit partition table with sfdisk

sfdisk is a nice tool for playing with disk partitions. It has many features, and is very useful when it is necessary to make changes to disk partitions. Before doing anything with sfdisk I recommend reading the sfdisk man page to get a basic picture of what sfdisk is and what it can be used for. If not used carefully it can be a dangerous command, especially if pointed to the wrong device, so … think before running it
I needed it when it was necessary to clone the partition table of one sdcard to another ( fdisk can do this too )

To save partition table, I did

 
# sfdisk --dump /dev/sdb > 16gcard

Now the 16gcard dump file contains

# cat 16gcard
label: dos
label-id: 0x00000000
device: /dev/sdb
unit: sectors

/dev/sdb1 : start=        8192, size=    31108096, type=c

This is what I need, however the new card is double the size, so 32 GB, and writing the above to the new card would occupy just the first 16 GB. Luckily, sfdisk is a very versatile tool, and it allows editing the partition dump and then writing it back to disk. Open 16gcard in a text editor ( eg. Vim ) and edit the dump file. If the original size is 31108096 sectors ( of 512 B each ), then the new size would be 61399040 sectors ( of 512 B each ), and the new dump file is

# cat 16gcard 
label: dos
label-id: 0x00000000
device: /dev/sdb
unit: sectors

/dev/sdb1 : start=        8192, size=    61399040, type=c
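
Where does 61399040 come from? For a single partition that should run to the end of the card, the size is simply the total number of sectors on the card minus the start sector. A small sketch of that arithmetic, with partition_size_sectors as a made-up helper name; the total of 61407232 sectors is the value fdisk -l reports for the new card further down.

```shell
# Size in sectors of a partition that starts at <start> and runs to the
# end of the disk. Arguments: total sectors on the disk, start sector.
partition_size_sectors() {
    echo $(( $1 - $2 ))
}

# On a real system the total could be read with: blockdev --getsz /dev/sdb
partition_size_sectors 61407232 8192
```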

Now I can write it to new card

 
# sfdisk /dev/sdb < 16gcard

and fdisk -l shows

#  fdisk -l /dev/sdb
Disk /dev/sdb: 29.3 GiB, 31440502784 bytes, 61407232 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device     Boot Start      End  Sectors  Size Id Type
/dev/sdb1  *     2048 61407231 61405184 29.3G  c W95 FAT32 (LBA)

This is the very same partition table as the one I had on the old card, except for the last sector, which is adapted to suit the size of the new card.

#linux, #sfdisk, #storage

ekuric 16:19 on August 28, 2011
Tags: udev writting udev rules disk storage

Writing udev rules

udev has been part of Linux for a long time (everything started back in 2003). Here we will not go back and write about the history of udev; if you want to know more about it, a simple search on Google will do.

Here I am going to write a udev rule which will make your disk always show up under the same name. The reason for that: you want your USB disk, and all its partitions, to be recognized for example as /dev/my_usb, and to keep that name across reboots.

The process described here is tested on RHEL 5, but will work on CentOS, Fedora, … and Debian as well. Writing udev rules is very easy once you get basic understanding what to put inside them. The best starting point for getting information about udev rules are ones already written for your distribution. On RHEL 5 in /etc/udev/rules.d/ you will find a list of udev rules currently present on your system.

For example in /etc/udev/rules.d/50-udev.rules you will find [ here is just part of it ]

KERNEL=="*", OWNER="root" GROUP="root", MODE="0600"

# all block devices

SUBSYSTEM=="block", GROUP="disk", MODE="0640"
KERNEL=="root", GROUP="disk", MODE="0640"

……
……

Every time you plug in your external usb drive ( the same applies for printers / cameras / NICs ), the OS will execute the actions planned for that particular device. If you take a closer look at the udev rules in /etc/udev/rules.d/*.rules you will see how particular rules are written. They are something like

[ as an example ]

ACTION=="add", SUBSYSTEM=="firmware", ENV{FIRMWARE}=="*", RUN="/sbin/firmware_helper", OPTIONS="last_rule"

You now probably wonder where all these options come from. They come from the udev system, as it generates a set of actions for devices already connected / newly connected.

For ACTION, SUBSYSTEM, RUN … etc, check

# man udev

So how can one write a udev rule? The udev designers made our lives easier and provided us with tools for easier work. Among other goodies, the udev package includes udevinfo

The man page for udevinfo says

udevinfo – query device information from the udev database

Fine.

We will now try and see how we can use udevinfo to get more information about the particular device we would like to create a custom udev rule for. In case I know that my disk is marked as /dev/sda, I can run udevinfo to get more information about it

# udevinfo -a -p $(udevinfo -q path -n /dev/sda)

It will produce a very long output,with all necessary information for new udev rule. In my case we have (only part)

looking at device '/block/sda':
KERNEL=="sda"
SUBSYSTEM=="block"
….
….
BUS=="scsi"
DRIVER=="sd"
….
….

We can now do a simple copy/paste into our new udev rule. Let’s do that

KERNEL=="sda[1-9]",SUBSYSTEM=="block",ACTION=="add",SYMLINK+="elvir%n"

Here we are instructing udev to check device sda ( and all partitions on it ), which is on the subsystem recognized as block device, when it is added to the system, and to create symbolic links for these original devices (sda1, sda2, sda3…) under /dev as elvir1, elvir2, …
Above are some of the options; you can also use the disk serial number in a udev rule. For my disk, udevinfo returns this serial number

SYSFS{serial}=="FC6E441F8289001C"

probably needless to say, adding it to the new udev rule is as simple as copying it there

KERNEL=="sda[1-9]",SUBSYSTEM=="block", SYSFS{serial}=="FC6E441F8289001C",ACTION=="add",SYMLINK+="elvir%n"

The new rule must be placed under /etc/udev/rules.d/, for example as /etc/udev/rules.d/45-custom.rules, and must have the extension .rules. It is important to mention that udev rules are read sequentially: rule 05- will be read before 40-. Please check your /etc/udev/rules.d/ and check that all rules have names in the format

number-rule.name.rules

Udev rules must be on one line, and do not use "line delimiters" in them
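
Because a rule has to be a single line with plain ASCII double quotes, typing it by hand is error prone. The sketch below prints the rule from this post for a given kernel name, serial number and symlink prefix; make_udev_rule is a hypothetical helper, and SYSFS{serial} is the syntax used in this post for RHEL 5 ( newer udev versions use ATTRS{...} instead ).

```shell
# Print a one-line udev rule that creates a symlink for every partition
# of a disk. Arguments: kernel name (eg sda), disk serial, symlink prefix.
make_udev_rule() {
    printf 'KERNEL=="%s[1-9]", SUBSYSTEM=="block", SYSFS{serial}=="%s", ACTION=="add", SYMLINK+="%s%%n"\n' \
        "$1" "$2" "$3"
}

make_udev_rule sda FC6E441F8289001C elvir
```

The output can be redirected into a file such as /etc/udev/rules.d/45-custom.rules once it looks right.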

We now have our udev rule. It is simple, but its purpose is to help understand writing more complex ones. More about udev can be found in

# man udev

man pages give excellent overview of all above mentioned options and their explanations

#udev-writting-udev-rules-disk-storage

ekuric 21:12 on August 26, 2011
Tags: lvm lvcreate pvcreate storage

lv on specific pv

Logical volumes ( lvm ) are a very convenient feature for managing storage space on linux/unix systems. We all know that creating a logical volume can be roughly described in the next three steps. Let's say we have physical disks /dev/sda and /dev/sdb

create physical volumes:

# pvcreate /dev/sda
# pvcreate /dev/sdb

create volume group:

# vgcreate myvg /dev/sda /dev/sdb

Next step is to create a logical volume on top of the newly created volume group. If we run next

# lvcreate -L 100M -n mylv /dev/myvg

logical volume mylv will be created from volume group myvg on a so called next free basis, which means it will start the logical volume on /dev/sda first, filling up the space we specified in the lvcreate command.

Sometimes we would like to lock down logical volume to be created on specified physical volume, to do that we can run

# lvcreate -L 100M -n mylv /dev/myvg /dev/sda

In last command, we are instructing lvcreate to create logical volume on top of myvg and placing it on physical volume /dev/sda

Running

# lvs -a -o +devices

you can see whether the above process was successful. It should show that mylv is placed on the desired pv
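
For scripting the same check, the devices column can be reduced to a plain list of physical volumes. A minimal sketch, assuming lvs prints entries such as /dev/sda(0) in the devices column; lv_pvs is a made-up helper name.

```shell
# List the physical volumes that back a logical volume, one per line.
# Argument: the LV as vg/lv, for example myvg/mylv.
lv_pvs() {
    lvs --noheadings -o devices "$1" |
        tr ',' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e 's/([0-9]*)$//' |
        sort -u
}

# Example (run as root):
#   lv_pvs myvg/mylv
```

If the logical volume really sits on one physical volume only, the function prints a single line with that device.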

#lvm-lvcreate-pvcreate-storage

ekuric 21:54 on July 3, 2011
Tags: luks encryption linux usb

Crypt your USB drives using LUKS

IMPORTANT: Below process can destroy your valuable data. I am not responsible for any kind of data loss. This manual is provided on as-is basis, without any reliability and responsibility. Use at your own risk.

One ( general ) rule you need to keep in mind before every manipulation with disks is very simple: make a backup before any action

It is really important to encrypt your usb drives and memory sticks, because it is easy to lose them, and with them some very important ( even worse, private ) data.

Encrypting the HDD(s) of your desktop/laptop I consider a must, so please take it seriously and set up drive encryption during system installation. Almost all ( modern ) Linux distributions offer the possibility to encrypt hard drives at an early stage of system installation, so please use that benefit.

[Open|Free|Net]BSDs also offer an option to encrypt hard drives, but for more detailed information please visit their respective web locations.

In the text below I will write a short howto on encrypting your USB drives and memory cards using LUKS.

LUKS stands for Linux Unified Key Setup and is a good solution when it comes to Linux encryption. So let's start 🙂

First find out how the disk you want to encrypt is recognized by your Linux system. The easiest way is to check
/var/log/messages after you connect the device to the system. On my system it is recognized as /dev/sdc

As OS I am at the moment using Fedora 14, and to use LUKS you will need to install the cryptsetup-luks package.

If you do not have package cryptsetup-luks installed on your system, then run

# yum install cryptsetup-luks

Once cryptsetup-luks is installed, we can proceed and set up our encrypted device

If you are paranoid when it comes to security, then before proceeding you can overwrite your USB drive with random data

# dd if=/dev/random of=/dev/sdc

Important: above command can last very long, depending on size of device.

So cryptsetup is installed and we are ready

Create partition on your device

# fdisk /dev/sdc

Run cryptsetup on the new partition

# cryptsetup luksFormat /dev/sdc1

[root@e-makina elvir]# cryptsetup luksFormat /dev/sdc1

WARNING!
========
This will overwrite data on /dev/sdc1 irrevocably.

Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase:
Verify passphrase:
[root@e-makina elvir]#

When it comes to picking a password, please do not use something like the name of your wife, hometown, high school, or similar stuff. You do not want to have a simple password for your LUKS setup, do you?

Next step is to open new LUKS encrypted device.

From man cryptsetup

luksOpen

opens the LUKS partition and sets up a mapping after successful verification
of the supplied key material (either via key file by --key-file, or via prompting).

As you can see here, we will bind our LUKS device ( /dev/sdc1 ) to some friendly name, let's say fedoraCrypt. Choosing the name I am leaving to your imagination.

# cryptsetup luksOpen /dev/sdc1 fedoraCrypt

After above action under /dev/mapper you will find your device as

# ls -l /dev/mapper | grep fedora
lrwxrwxrwx. 1 root root 8 Jul 3 20:47 fedoraCrypt -> ../dm-11

Make file system on your new device /dev/mapper/fedoraCrypt

# mkfs.ext4 /dev/mapper/fedoraCrypt

Mount new device

# mount /dev/mapper/fedoraCrypt /mnt/cryptdevice

And that is it.

Everything you write to /mnt/cryptdevice will be encrypted. Unmounting the encrypted device is as simple as

# umount /mnt/cryptdevice [ please make sure it is not busy ]
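
After unmounting, the mapping under /dev/mapper still exists until it is closed with cryptsetup luksClose. A small sketch that does both steps in order; close_crypt is a hypothetical helper name, and the mount point and mapping name in the example are the ones used in this post.

```shell
# Unmount an encrypted file system and remove its /dev/mapper mapping.
# The mapping is closed only if the unmount succeeded.
# Arguments: mount point, mapping name given to luksOpen.
close_crypt() {
    umount "$1" && cryptsetup luksClose "$2"
}

# Example (run as root):
#   close_crypt /mnt/cryptdevice fedoraCrypt
```

Once the mapping is closed, the data on the device is not readable again without the passphrase.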

Next time you connect your usb/memory card to your laptop, you will be prompted to enter password you provided during LUKS device setup

I suppose it is needless to say that the above procedure can also be applied to internal HDDs of your machine.

In above process I used Fedora 14, same process applies for Debian as cryptsetup-luks package is present in Debian repositories under same name as it is in Fedora

# aptitude search luks
cryptsetup-luks –

Comments are welcome

Thank you

ekuric

#luks-encryption-linux-usb

ekuric 21:51 on June 25, 2011

lvs output decrypting

So you run on your server

# lvs

and get some output, and you would like to know exactly what each particular bit means

On my test machine I have

# lvs
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
lvgfs2 lvtest -wi-a- 484.00M
lvtest lvtest -wi-ao 524.00M
quorumdisk quorum -wi-a- 20.00M

Here we will spend some time on the Attr field

You can see we have -wi-a- there, which has 6 bits. Probably some of you now think it is similar to the usual file permission attributes rwx, but not completely. Below are the lvm flags decoded

From left to right:

Bit #1
* (m)irrored
* (M)irrored without initial sync
* (o)rigin
* (p)vmove
* (s)napshot
* (S)napshot invalid
* (v)irtual
* (i)mage mirror
* (I)mage out-of-sync
* (c)onversion in progress
Bit #2
* (w)riteable
* (r)ead-only
Bit #3
* (c)ontiguous
* (n)ormal
* c(l)ing
* (a)nywhere
* (i)nherited
Bit #4
* (m)inor
Bit #5
* (a)ctive
* (s)uspended
* (I)nvalid snapshot
* (S)uspended snapshot
* (d)evice present without tables
* (i)nactive table on present device
Bit #6
* (o)pen

so in my case it is

-wi-a- = – writable inherit – active –
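
The same lookup can be done by a small shell function. This is only a sketch built from the flag list above: it looks at the first six characters and prints nothing for positions set to "-". decode_lv_attr and the labels it prints ( type, permissions, allocation, ... ) are names chosen for this example; newer LVM versions print more attribute characters, which this sketch ignores.

```shell
# Decode the first six characters of the lvs Attr field,
# following the flag list above.
decode_lv_attr() (
    attr=$1
    flag() { printf '%s' "$attr" | cut -c"$1"; }   # character at position $1

    case "$(flag 1)" in
        m) echo "type: mirrored" ;;
        M) echo "type: mirrored without initial sync" ;;
        o) echo "type: origin" ;;
        p) echo "type: pvmove" ;;
        s) echo "type: snapshot" ;;
        S) echo "type: snapshot invalid" ;;
        v) echo "type: virtual" ;;
        i) echo "type: image mirror" ;;
        I) echo "type: image out-of-sync" ;;
        c) echo "type: conversion in progress" ;;
    esac
    case "$(flag 2)" in
        w) echo "permissions: writeable" ;;
        r) echo "permissions: read-only" ;;
    esac
    case "$(flag 3)" in
        c) echo "allocation: contiguous" ;;
        n) echo "allocation: normal" ;;
        l) echo "allocation: cling" ;;
        a) echo "allocation: anywhere" ;;
        i) echo "allocation: inherited" ;;
    esac
    case "$(flag 4)" in
        m) echo "minor: yes" ;;
    esac
    case "$(flag 5)" in
        a) echo "state: active" ;;
        s) echo "state: suspended" ;;
        I) echo "state: invalid snapshot" ;;
        S) echo "state: suspended snapshot" ;;
        d) echo "state: device present without tables" ;;
        i) echo "state: inactive table on present device" ;;
    esac
    case "$(flag 6)" in
        o) echo "device: open" ;;
    esac
)

decode_lv_attr -wi-ao
```

For lvtest from the output above ( -wi-ao ) this prints the writeable, inherited and active lines plus a final line saying the device is open.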

elvir's [elvir kuric] blog

Linux and its fancy side
