[gfarm-announce:90032] Gfarm version 2.3.2, Gfarm Hadoop Plugin 1.0.0 released



Hi all,

We are pleased to announce the release of

 * Gfarm file system version 2.3.2,
 * Gfarm2fs version 1.2.1,
 * Gfarm GridFTP DSI version 1.0.1, and
 * [NEW] Gfarm Hadoop plug-in version 1.0.0.

		http://sourceforge.net/projects/gfarm/

Major updates include improved performance and scheduling of the
automatic replica creation option for gfarm2fs.  The new Gfarm Hadoop
plug-in enables Hadoop MapReduce applications to access Gfarm
directly.  Our performance evaluation shows better performance than
HDFS, and there is no need to copy data to HDFS anymore.

This release also includes many updates and some bug fixes.  We
strongly recommend upgrading to this version.

Release note for Gfarm 2.3.2
============================

[2010.7.1]

New Command
* gfsched - schedule and display available file system nodes
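
  For example, the following lists candidate file system nodes (the
  hostnames shown are placeholders; see gfsched(1) for the full
  option list):

    $ gfsched
    node01.example.org
    node02.example.org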

New API
* new scheduling APIs - gfarm_schedule_hosts_domain_all,
  gfarm_host_sched_info_free, gfarm_schedule_hosts{,_acyclic}{,_to_write}
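
The following C sketch shows one way these APIs might be used
together.  The argument lists below are assumptions inferred from the
function names, not taken from the shipped headers; check
<gfarm/gfarm.h> for the real declarations before relying on them:

  #include <stdio.h>
  #include <gfarm/gfarm.h>

  int
  main(int argc, char **argv)
  {
      gfarm_error_t e;
      int nhosts, i;
      struct gfarm_host_sched_info *infos;

      e = gfarm_initialize(&argc, &argv);
      if (e != GFARM_ERR_NO_ERROR) {
          fprintf(stderr, "%s\n", gfarm_error_string(e));
          return (1);
      }
      /* assumed arguments: path, domain, out-count, out-array */
      e = gfarm_schedule_hosts_domain_all("/", "", &nhosts, &infos);
      if (e == GFARM_ERR_NO_ERROR) {
          for (i = 0; i < nhosts; i++)
              printf("%s\n", infos[i].host); /* member name assumed */
          gfarm_host_sched_info_free(nhosts, infos);
      }
      gfarm_terminate();
      return (0);
  }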

New configuration in gfarm2.conf
* no_file_system_node_timeout and gfmd_reconnection_timeout directives
  specify how long to keep trying to find a file system node and to
  reconnect to the gfmd, respectively.  The default for both is 30
  seconds.
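
  For example, to extend both timeouts to 60 seconds, add the
  following lines to gfarm2.conf (the values are only illustrative):

    no_file_system_node_timeout 60
    gfmd_reconnection_timeout 60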

Documentation
* manual pages - gfsched(1),
  gfarm(3), gfarm_initialize(3), gfarm_terminate(3), 
  gfs_pio_create(3), gfs_pio_open(3), gfs_pio_close(3), gfs_pio_write(3),
  gfs_pio_read(3)

Updated feature
* support OpenSSL 1.0.0
* support kfreebsd-gnu and linux-gnueabi
* gfmd - check and repair nlinks at start-up
* gfsd - if an input/output error occurs, gfsd kills itself to cope
  with the hardware failure
* gfrm - new -f option to force removal
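
  For example, the new gfrm -f option forces removal (the path below
  is a placeholder):

    $ gfrm -f /tmp/scratch.dat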

Bug fix
* gfrep - the -x option does not remove the excessive number of file
  replicas when some file replica creations fail
* gfhost - -c/-m/-d without a hostname doesn't cause an error [sf.net
  trac #93]
* gfs_pio_fstat() - may not return correct file size [sf.net trac
  #111]
* gfsd failed to report an error when its hostname is not registered
  in gfmd (i.e. "gfhost -M")
* the file close operation is missing in gfsd when a client crashes
  [sf.net trac #2]
* fix a data race when calculating the total amount of disk usage
  while a file system node goes up and down
* fix missing metadata update when GFARM_FILE_TRUNC is specified
  [sf.net trac #103]
* fix missing permission check when GFARM_FILE_RDONLY|GFARM_FILE_TRUNC
  is specified [sf.net trac #107]
* try the next authentication method when permission is denied
* the test program "fsx" causes an assertion failure [sf.net trac
  #102]
* fix compilation errors on FreeBSD 8.0
* fix bashism reported by checkbashisms
* UNIX sockets and their parent directories are not removed when gfsd
  is stopped [sf.net trac #94]

Release note for Gfarm2fs 1.2.1
===============================

[2010.6.29]

Updated feature
* improve performance and scheduling of automatic file replication
* stat() returns the correct file size even while another process is
  writing to the file

Bug fix
* fix #106 - memory leak in gfarm2fs symbolic link handling
* release() does not return an error

Release note for Gfarm GridFTP DSI 1.0.1
========================================

[2010.6.29]

Updated feature
* use local_user_map and local_group_map to identify the local users
  and local groups
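
  As a sketch, each map file lists one global name and the
  corresponding local name per line, and the files are pointed to
  from gfarm2.conf.  The file names and entries below are
  placeholders; consult the Gfarm documentation for the exact format:

    # in gfarm2.conf
    local_user_map /etc/gfarm/local_user_map
    local_group_map /etc/gfarm/local_group_map

    # in /etc/gfarm/local_user_map: <global user> <local user>
    alice  alice_local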

Bug fix
* do not return the local uid when the global name and the local
  account name are the same, since the local uid does not make sense
  in GSI authentication

Gfarm Hadoop Plug-in Introduction
=================================

You can use Hadoop, an open-source MapReduce framework, on Gfarm with this
Hadoop-Gfarm plugin.

Configuration
-------------
The following configuration in core-site.xml enables the Gfarm Hadoop
plug-in:

  <property>
    <name>fs.gfarm.impl</name>
    <value>org.apache.hadoop.fs.gfarmfs.GfarmFileSystem</value>
    <description>The FileSystem for gfarm: uris.</description>
  </property>

Gfarm URL
---------
The Gfarm file system can be accessed by a Gfarm URL such as
gfarm:///path/name.  Note that the current version cannot specify the
metadata server in the URL.

Here are some examples:

$ ${HADOOP_HOME}/bin/hadoop dfs -ls gfarm:///
$ ${HADOOP_HOME}/bin/hadoop jar hadoop-${VERSION}-examples.jar wordcount gfarm:///input gfarm:///output

When you specify gfarm:/// as fs.default.name, you do not need to
specify a full Gfarm URL; a plain file name such as 'input' or
'output' is enough.
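
A sketch of the corresponding core-site.xml entry (the property value
is taken from the text above):

  <property>
    <name>fs.default.name</name>
    <value>gfarm:///</value>
  </property>

With that setting, the earlier commands become: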

$ ${HADOOP_HOME}/bin/hadoop dfs -ls
$ ${HADOOP_HOME}/bin/hadoop jar hadoop-${VERSION}-examples.jar wordcount input output

---
Osamu Tatebe
Department of Computer Science, University of Tsukuba
1-1-1 Tennodai, Tsukuba, Ibaraki 3058577 Japan